AFRL-RH-WP-TR-2013-0063 


CYBERSPACE MATH MODELS 


Leslie M. Blaha, Daniel W. Repperger, Katheryn A. Farris, Fairul Mohd-Zaid, 
Paul R. Havig, Xiaoping Shen, Russell Francis, John P. McIntire, 
Lyndsey McIntire, David J. Rieksts 
Air Force Research Laboratory 


JUNE 2013 
Final Report 


Distribution A: Approved for public release; distribution is unlimited. 


See additional restrictions described on inside pages 


AIR FORCE RESEARCH LABORATORY 
711'" HUMAN PERFORMANCE WING, 
HUMAN EFFECTIVENESS DIRECTORATE, 
WRIGHT-PATTERSON AIR FORCE BASE, OH 45433 
AIR FORCE MATERIEL COMMAND 
UNITED STATES AIR FORCE 


NOTICE AND SIGNATURE PAGE 


Using Government drawings, specifications, or other data included in this document for any purpose 
other than Government procurement does not in any way obligate the U.S. Government. The fact that 
the Government formulated or supplied the drawings, specifications, or other data does not license the 
holder or any other person or corporation; or convey any rights or permission to manufacture, use, or 
sell any patented invention that may relate to them. 


Qualified requestors may obtain copies of this report from the Defense Technical 
Information Center (DTIC). 


AFRL-RH-WP-TR-2013-0063 HAS BEEN REVIEWED AND IS APPROVED FOR PUBLICATION 
IN ACCORDANCE WITH ASSIGNED DISTRIBUTION STATEMENT. 


\\signed\\ \\signed\\ 

LESLIE M. BLAHA JEFFREY L. CRAIG 

Battlespace Visualization Branch Chief, Battlespace Visualization Branch 
\\signed\\ 


WILLIAM E. RUSSELL 
Acting Chief, Warfighter Interface Division 


This report is published in the interest of scientific and technical information exchange, and its 
publication does not constitute the Government’s approval or disapproval of its ideas or findings. 


d 
REPORT DOCUMENTATION PAGE 


The public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching existing data sources, searching existing data 
sources, gathering and maintaining the data needed, and completing and reviewing the collection of information. Send comments regarding this burden estimate or any other aspect of this collection of information, 
including suggestions for reducing this burden, to Department of Defense, Washington Headquarters Services, Directorate for Information Operations and Reports (0704-0188), 1215 Jefferson Davis Highway, 
Suite 1204, Arlington, VA 22202-4302. Respondents should be aware that notwithstanding any other provision of law, no person shall be subject to any penalty for failing to comply with a collection of information if 
it does not display a currently valid OMB control number. PLEASE DO NOT RETURN YOUR FORM TO THE ABOVE ADDRESS. 


1. REPORT DATE (DD-MM-YY) 2. REPORT TYPE 3. DATES COVERED (From - To) 
4. TITLE AND SUBTITLE 5a. CONTRACT NUMBER 
CYBERSPACE MATH MODELS FA8650-08-D-6801 
5b. GRANT NUMBER 
5c. PROGRAM ELEMENT NUMBER 
61102F 
6. AUTHOR(S) 5d. PROJECT NUMBER 
Leslie M. Blaha, Daniel W. Repperger, Katheryn A. Farris, Fairul Mohd-Zaid, Paul R. 2313 
Havig, Xiaoping Shen, Russell Francis, John P. McIntire, Lyndsey McIntire, David J. . TASK NUMBER 
Rieksts HC 
2313HC57 
7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES) 8. PERFORMING ORGANIZATION 
711 HPW/RHCV REPORT NUMBER 


Battlespace Visualization Branch 
Wright-Patterson Air Force Base, OH 45433 


9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES) 10. SPONSORING/MONITORING 
Air Force Materiel Command AGENCY ACRONYMS) 
Air Force Research Laboratory AFRL/RHCV 


th e 
711” Human Performance Wing 11. SPONSORING/MONITORING 
Human Effectiveness Directorate AGENCY REPORT NUMBER(S) 


Warfighter Interface Division AFRL-RH-WP-TR-2013-0063 
Battlespace Visualization Branch 
Wright-Patterson Air Force Base, OH 45433 


12. DISTRIBUTION/AVAILABILITY STATEMENT 
Distribution A: Approved for public release; distribution unlimited. 


13. SUPPLEMENTARY NOTES 
88ABW/PA Cleared 08/20/2013; 88ABW-2013-3715 
14. ABSTRACT 


The goal of this effort was to explore ways of characterizing the complexity, performance, vulnerability, and dynamic properties of networks 
and complex systems. Techniques investigated included information theory measures and stochastic resonance, together with graphical and 


statistical assessments. Network persistance was characterized by the Hurst parameter in a network model based on fractional brownian 
motion, generalized by fast Fourier transform and studied with wavelet analysis. Network uncertainty was cahracterized with approximate 
entropy. General applicability to other complex systems were were studied in the areas of sunspot cycles, chatbot detection, genetic data, 
and image processing. 

15. SUBJECT TERMS 


Complex Networks, Visualization, Image Fusion, Cyberspace, Information Theory 


16. SECURITY CLASSIFICATION OF: 17. LIMITATION 18. NUMBER OF 19a. NAME OF RESPONSIBLE PERSON (Monitor) 


a. REPORT | b. ABSTRACT | c. THIS PAGE OF ABSTRACT: Leslie M. Blaha 
Unclassified | Unclassified | Unclassified SAR 19b. TELEPHONE NUMBER (Include Area Code) 


Standard Form 298 (Rev. 8-98) 
Prescribed by ANSI Std. Z39-18 


THIS PAGE INTENTIONALLY LEFT BLANK 


Table of Contents 


EXO CUELV E SUIIIALY sie isjs iadiveysonsvelessbevbevsacmaciystatiaevcavedcesssustedad tata slaWaaediesasios bias athe vansladeaties 1 
Stochastic Resonance-a Nonlinear Control Theory Interpretation..............::ccsccssceseeeereeeeees 4 
Graphical and Statistical Communication Patterns of Automated Conversational Agents in 
Collaborative Computer-Mediated Communication SysteMs ...........csscseeseeeeeseceeenaeeneeees 15 
Quantify Effects of Long Range Memory on Predictability of Complex Systems.............. 22 
On Using Information-Theoretic Quantities in Characterization Dissimilarity of DNA 
ROUEN staves astride ds Sepec daca ody Suncuebnncaa tudes do tucedlacntety Mucuenuvasgudeedadcaidenaatod tre renescarebeseldamicseteteatons 26 
Runge Phenomenon: A Virtual Artifact in Image Processing.............::ccssseesseceseceeeeeeneeeeeeees 33 


Appendix A: Investigation of Comples Networs and Information Theory at WPAFB........ 40 
Appendix B: Fractional Calculus-A New Paradigm for Understanding Complex Systems 61 
Appendix C: Sparse Statistical Data Analysis Based on the L;=norm = Case Study ......... 80 
Appendix D: Shannon Entropy and Relatives: A Brief Review and New Development .... 87 
Appendix E: Digital Signal Representation with Slepian Series .............ecesesseceneereeeeeeeeees 93 


Appendix F: Complexity Analysis and Information Visualization ............ceseseseeeeeereeees 100 


iii 


1.) 
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Executive Summary 
Final Report to Dr. Robert J. Bonneau, AFOSR Program Officer 


Laboratory Research Initiation Request Project: Quantifying Cyberspace Situational 
Awareness, Performance, Vulnerability, and the Design of Optimal Cyber Attacks in 
Complex Networks 


Contract/Grant FA8650-08-D-6801, September 2008 to August 2011 


Final accomplishments: This final progress report describes work conducted under the 
AFOSR Laboratory Request to Initiate Research (LRIR). Our goal was to quantify 
cyberspace vulnerability through a probability model with multiple parameters. We 
primarily focused on two key areas: 1.) Quantifying the level of network persistence (ie. 
long range memory), and, 2.) Quantifying the level of network uncertainty (ie. 
predictability). 


Quantifying the level of network persistence (long range memory). Fractional 
Brownian motion (fBm) is chosen as a basic design model for complex networks. It has 
been used as a theoretical framework to study non-stationarity and long- range 
dependence. Fractional Brownian motion is characterized by the Hurst parameter (H) 
which requires sophisticated techniques that often yield ambiguous results. We utilized a 
Fast Fourier Transform (FFT) based method to generalize fBm. These synthetic fBm data 
are then analyzed by using wavelet multi- resolution analysis and confirmed with 
extensive numerical simulations. Further details are reported in the list of archival 
publications and conference talks. 


Quantifying the level of network uncertainty (predictability). To quantify network 
uncertainty, we used a powerful tool based on information theory called approximate 
entropy (ApEn). Our study began with a literature survey on information entropy and was 
initiated in the summer of 2010. Initial numerical experiments were designed to study the 
relationship between the persistence parameter H and the uncertainty parameter ApEn. 
Further details are reported in the list of archival publications and conference talks. 


Changes in research objectives, if any: Yes. The original proposal emphasized specific 
applications, such as modeling cyberspace with an actual cyber network. However, due to 
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the untimely passing of Dr. Daniel Repperger, we made some modifications such as taking 
amore general approach to assessing non-stationary time-series data with long- range 
memory through the analysis of sunspot cycles and stock market prices. The underlying 
structure of a complex network is the same whether you are analyzing the stock market or 
the internet. Hence, our studies still analyzes characteristics of a complex network and the 
data we used was more from a motivation of practicality because it was publicly available, 
parsed, and “cleaned up”. 


Change in AFOSR program manager, if any: Yes. The project has been managed by 
Dr. Paul R. Havig and Ms. Katheryn Farris since the original laboratory PI, Dr. Daniel 
Repperger of the 711" HPW, deceased on January 03, 2010 and Co-PI, Dr. Jeffrey 
McDonnell of AFIT, was deployed. We engaged in strong collaboration with Dr. 
Xiaoping Annie Shen, Mathematics Professor of Ohio University. She was sponsored by 
Summer Faculty Fellowship Program (SFFP) and worked with us full-time during the 
summers of 2009, 2010 and 2011. She also contracted out part of her time from October, 
2010 to May, 2011 to continue research efforts under the complex networks domain. 


Extensions granted or milestones slipped, if any: None 
Archival Publications, Conference Presentations and Other: 
Archival publications 


1. X. A. Shen, K. A. Farris and P. R. Havig, “Quantifying Effects of Long-Range 
Memory on Predictability of Long-Range Systems,” Proceedings of the 2011 National 
Aerospace and Electronics Conference (NAECON), July 20-22, 2011, Dayton, OH. 


2. D. W. Repperger and K. A. Farris, Stochastic resonance — interpretations from a 
nonlinear control theory perspective, International Journal of Systems Science, July, 
2010. 


3. J. P. McIntire, P. R. Havig, K. A. Farris and L. K. McIntire, “Graphical and Statistical 
Communication Patterns of Automated Conversational Agents in Collaborative 
Computer-Mediated Communication Systems,” Proceedings of the 2010 National 
Aersospace and Electronics Conference (NAECON), July 14-16, 2010, Dayton, OH. 


Conference presentations 


4. X. Shen, Sparse statistical data analysis based on the L1-norm - case study, Special 
session on Sparse Data Representations and Applications, AMS Spring Southeastern 
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Section Meeting, Statesboro, GA, March 18-20, 2011. 


5. X. Shen, Digital signal representation with Slepian series, Special session on Control 
Systems and Signal Processing, AMS Spring Southeastern Section Meeting, Statesboro, 
GA, March 18-20, 2011. 


6. X. Shen, K. A. Farris, D. Riesksts, and P. R. Havig, Shannon entropy and relatives: A 
brief review and new development, Special Session on Computational and Applied 
Mathematics, AMS Section Meeting, Richmond, VA, November 6-7, 2010. 


7. D. W. Repperger, J. S. Shattuck, and K.A. Farris, Complex Network Analysis in Power 
Grids, 35th Annual Dayton-Cincinnati Aerospace Sciences Symposium, March 9, 2010, 
Dayton, Ohio. (*presented posthumously) 


8. D. W. Repperger, K. A. Farris and R. Bradford, “Fractional Calculus — A Paradigm for 
Understanding Complex Systems,” 34th Annual Dayton-Cincinnati Aerospace Sciences 
Symposium, March 3, 2009, Dayton, Ohio. 


9. D. W. Repperger, K. A. Farris and R. Bradford, “Computational Studies of Complex 
Networks Using Information Flow,” 34th Annual Dayton-Cincinnati Aerospace Sciences 
Symposium, March 3, 2009, Dayton, Ohio. 


Other presentations 


10. X. A. Shen, D. J. Rieksts, K. A. Farris and P. R. Havig, (Poster) Visualizing 
Complexity of Sunspot Cycles Via Wavelet Transform, the Summer Intern Research 
Day, August 3, WPAFB 2010, Dayton, OH. 


11. K. A. Farris and P. R. Havig, Complexity Analysis and Information Visualization, 
Ohio University, March 5, 2010, Athens, Ohio. 


12. D. W. Repperger, Studies on Information Theoretic Variables and Decision Making, 
seminar talk at Quantitative Psychology Program, the Ohio State University, October 19, 
2009, Columbus, Ohio. 


New discoveries, inventions, or patent disclosures during this reporting period: 


Invention: Fractal Filter Apparatus Based on Wavelets, D. W. Repperger, X. A. Shen, A. 
R. Pinkus and K. A. Farris. US Air Force Invention Number: AF# 1092, approved 
11/16/2009. Patent application pending. 
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Stochastic resonance (SR) is an effect that has been known (Benzi, R., Sutera, A., and Vulpiani, A. (1981), 
‘The Mechanism of Stochastic Resonance’, Journal of Physics, A14, L453—L457) for almost three decades and 
has been extensively studied in biology, statistics, signal processing and in numerous other eclectic areas 
(Wiesenfeld, K., and Moss, F. (1995), ‘Stochastic Resonance and the Benefits of Noise: From Ice Ages to 
Crayfish and Squids’, Nature, 373, 33-36). Herein, a nonlinear control theory analysis is conducted on how to 
better understand the class of systems that may exhibit the SR effect. Using nonlinear control theory methods, 
equilibrium points are manipulated to create the SR response (similar to shaping dynamical response in a phase 
plane). From this approach, a means of synthesising and designing the appropriate class of nonlinear systems is 
introduced. New types of nonlinear dynamics that demonstrate the SR effects are discovered, which may have 
utility in control theory as well as in many diverse applications. A numerical simulation illustrates some powerful 


attributes of these systems. 


Keywords: nonlinear dynamics; stochastic resonance; biological systems 


1. Introduction 


Stochastic resonance (SR) provides an enabling prop- 
erty to a certain class of nonlinear systems, which may 
be viewed as an optimisation of system design, and 
commonly occurs in nature. In this article, a method of 
synthesising such systems is conducted by manipula- 
tion of equilibrium points, which is a common practice 
in nonlinear control theory. The term ‘resonance’ 
refers to a unimodal peak of the SR curve displayed, 
for example, in Figure 1. The term ‘stochastic’ 
indicates noise injection into a system. The vertical 
axis in Figure 1 portrays an output variable or 
property of a system, which is desired to be optimised. 
In the signal processing literature, for example, the 
y-axis could represent the signal-to-noise ratio (SNR) 
gain of a system (Loerincz, Gingl, and Kiss 1996; 
Chapeau-Blondeau 1997). An SR is a property of the 
input-output characteristics of a nonlinear system, 
which is shown in the sequel. The analysis and 
synthesis of nonlinear systems can be approached a 
number of ways, e.g. by sliding mode control (Diong 
2004) and passivity methods (Son, Yang, Jo, Shim, and 
Seo 2004). Manipulating the response of a nonlinear 
system is a topic of interest, e.g. mitigating a limit cycle 
using specified feedback (Repperger, Buckholtz, and 
Daniels 2004). Some related problems in the control 
theory area involve chaotic systems (Serletis and 


Andreadis 2000; Apostolou and King 2002; Liao and 
Chen 2003; Wei and Billing 2004; Wang and Ip 2005). 
Another property of nonlinear systems, which will be 
shown later, involves the bifurcation characteristics of 
the locus of equilibrium points. Bifurcation studies are 
investigated in singular systems (Li and Liu 1999) 
within biological systems (Hritonenko and Yatsenko 
2007). Equilibrium points can be managed (Shin and 
Chung 2002), a topic to be discussed in the ensuing 
analysis. Finally, the properties of systems that exhibit 
SR appear in the most basic forms of biological 
response (Repperger, Phillips, and Neidhard 2001) and 
can even be linked to the manner in which image 
processing is performed in nature (Xu, Jiang, Wu, and 
Repperger 2009). It should be mentioned that the SR 
effect can also occur in leaky integrate-and-fire neuron 
models, which have been investigated in Repperger 
et al. (2001) and are outside the scope of the procedures 
discussed herein. 

To continue the discussion of the manner how SR 
works, the horizontal axis usually represents the power 
of the noise intensity in the input signal as indicated in 
Figure 1. Figure 2 presents a block diagram description 
of a commonly studied SR system. The input signal is 
S(t) and is typically deterministic; the measured output 
signal is y(t). The noise term 7(f) is usually assumed 
to be zero mean and Gaussian, but this is not a 
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Figure 1. SR resonance effect. 
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Figure 2. Block diagram of SR process. 


strict requirement. The dynamical equation in Figure 2 
is classically described by 


WD) = —y* — @) + SI) + 0) (1) 


By measurement of the scalar variable y(t), sub- 
jected to an injected noise n(t), the goal is to correctly 
determine if the deterministic signal S(t) may be in a 
high or low state. This model is very akin to how 
biological systems work and respond in nature. Two 
physical viewpoints on applications are presented on 
how the SR effect is traditionally envisioned in the 
literature. 


2. A biological and statistical viewpoint on why the 
SR effect occurs 


At first it appears counter intuitive that the system in 
Equation (1) may gain some advantage in its output 
capabilities related to y(t) when adding the noise 7(7) 
to the input signal S(¢). From the extant literature, 
however, there are two important physical applications 
that demonstrate that the shape of the curve in 
Figure 1 can be realised and used to optimise system 
performance. The first perspective is in terms of correct 
statistical detection of the signal S(t) being either in a 
high or low state. Since this article has the focus on the 
nonlinear control theory aspects of the design of SR 
systems, the prior applications will be briefly described 
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Figure 3. Application | — subthreshold identification of S(¢). 


here with more 
Appendix 1. 


specific material presented in 


2.1. Application 1: correct detection of S(t) when it 
is subthreshold 


In Figure 3, the input signal S(f) is a trapezoid and is 
below a threshold /. The resonance curve in Figure | 
for this application has the vertical axis being propor- 
tional to the correct detection of S(t) when it is truly in 
a high state. Appendix 1 describes these details (cf 
Repperger, Phillips, Berlin, Neidhard-Doll, and Haas 
2005 for a haptics study related to Figure 3, which 
represents a more practical application of this effect 
involving human subjects and subliminal force signals 
that helped improve tracking performance and situa- 
tional awareness). 


2.2. Application 2: the bipotential well problem 


The second application occurs in the physics literature. 
In Figure 4, the signal S(t) may represent the state of 
the ball in a bipotential well. The ball can be in either a 
high or low state (right or left well) and is subthreshold 
by having insufficient kinetic energy of motion as 
compared to the potential energy necessary to switch 
states. Similar to Application 1, the resonance curve in 
Figure 1 has the vertical axis proportional to the 
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Figure 5. The equilibrium points from Figure 4 that induce 
the SR effect to occur. 


correct placement of S(t) in the high state (the right 
most potential well) when S(t) is high. Again, these 
details are described further in Appendix 1. 


3. A physics perspective of the SR system and 
equilibrium points 
Figure 4 presents a physical platform to examine how 
the dynamics of SR arises and its relationship to 
Equation (1). Examining the homogeneous portion of 
Equation (1), the equilibrium points can be determined 
by letting y(t) > oo or letting y(t) > 0. 
Thus 


WD) =" -#) > 0 > y=0, 
y = ta as equilibrium points 


(2) 


and Figure 5 portrays a plot of these equilibrium 
points in a phase plane. The velocity flow arrows on 
the y-(horizontal) axis indicate that the equilibrium 
point y=0 is unstable, and the two equilibrium points 
y=a are both stable. The equilibrium diagram in 
Figure 5 concurs with Figure 4 where, physically, the 
unstable equilibrium (the origin) is coincident with the 
top of the small hill between the two potential wells. 
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The bottoms of the two potential wells represent the 
location of the two stable equilibrium points at the 
locations y= a. 

From a physics perspective, one can also view the 
dynamics of the homogeneous portion of Equation (1) 
as a force equation associated with a potential energy 
function V(y). This paradigm describes the potential 
well in Figure 4 as the spatial integral of the negative 
force gradient. For example, let V(y) represent the 
potential function, then the homogeneous portion 
of Equation (1) is given by 


OV(y) _ 2 92 
a [-Wy — @°)] 
= —force spatial gradient, (3) 
which integrates to 
a 1 
Ve ee ps Aer 4 + 4 
(y) av tay te (4) 


where c is constant. Thus V(y) of Equation (4) 
represents the quartic potential energy function curve 
in Figure 4, derived from systems that show the SR 
effect. This concept can now be generalised to other 
examples. 


4. Generalising SR systems using nonlinear dynamics 
principles 

From the discussions in Appendix 1, the following 

definitions will apply to classes of nonlinear systems 

that would exhibit the SR effect: 


Definition: A nonlinear system y(t) = f(y, S(2), n(d) 
will exhibit an SR effect if the function f(-) has three or 
more equilibrium points. The equilibrium points must 
have an unstable equilibrium point interlaced between 
two stable equilibrium points, similar to Figure 5. 


Discussion: It is clear that f(-) must be nonlinear. In 
order to have three or more equilibrium points requires 
¥(t) = f(y, S(t), n(t)) = O(y°), which must be of order 
3 or higher, excluding linear systems. As in Figure 5, 
the domain of the state space, where this SR effect will 
occur is locally near the three equilibrium points. This 
definition can now be expanded to a much wider class 
of systems that occur in nonlinear dynamics. 


5. Relating the SR effect to a super critical pitchfork 
bifurcation 


By relating the class of systems that may exhibit the 
SR effect to properties of their equilibrium points as 
described in the prior section, generalisations now can 
be obtained to different systems in the areas of 
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Figure 6. The equilibrium points from Equation (5). 


nonlinear dynamics (Strogatz 1995). A familiar exam- 
ple of systems that have the same property, as in the 
definition given previously, include nonlinear dynami- 
cal systems that fall in a class of equilibrium points 
determined by a critical pitchfork bifurcation. For 
example, consider the nonlinear system with dynamics 
(a well-studied problem): 


(5) 


with Figure 6 describing the set of all equilibrium 
points of Equation (5) if r>0. It is clear that 
Equation (5) is akin to the prior discussion on the 
classical equation that produces the SR effect. The 
relationship of Equation (5) and systems that produce 
the SR effect can now be described in terms of 
the concepts of a vector flow field. 


x=rx—-x= xr — x) 


5.1. A vector flow field representation of 
Equation (5) 

For a system of the form x = f(x) to determine the 
equilibrium points when f(x) may be a nonlinear 
function, it can be obtained graphically by plotting 
F(x) in the phase plane of x versus x. The intersections 
of f(x) with the x-axis (x = 0) represent equilibrium 
point solutions, which can be determined graphically. 
For Equation (5), the three possible cases r<0, r=0, 
and r>0 are displayed in Figure 7. The case r>0 
corresponds to systems that show the SR effect. If one 
now considers r as the independent variable and a 
graph is now made of the class of all equilibrium points 
on the vertical axis, as in Figure 8, the region where 
the SR effect occurs appears in the right half of that 
diagram. 

Thus in Figure 8, which includes all three cases 
(r<0, r=0, and r>0), one can design the two stable 
equilibrium points by adjusting the r>0 value to cover 
any area of the phase plane desired. From Figure 6, 
this extends the domain and range of the SR effect to 
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Figure 7. Vector flow fields for different values of r in 
Equation (5). 


Locus of 
equilibrium x 
Points 


Design of 
equilibrium 
Points 


Stable 


Stable “Unstable 


Se fn 


SR effect occurs here 


Figure 8. The pitchfork bifurcation representation of the 
class of all equilibrium points. 


any portion of the phase plane required for a particular 
application. 


6. Alternative properties of SR systems 


It is not just a coincidence that the V(v) function in 
Equation (4) is an even function (symmetric about 
the vertical axis in Figure 4). This means replacing y by 
—y yields the same equations and similar equilibrium 
points. This is a property of systems that give rise to 
super critical pitchfork bifurcations. Figure 8 also 
shows this requisite symmetry in the pitchfork shape. 
The term ‘super’ in super critical refers to the —x* term 
in the homogeneous term of x = —x? + a’x, which 
provides global stability as x + +00. This is a conse- 
quence of the two stable equilibrium points x=-a, 
which are attracting for trajectories that have |x| >a, 
thus ensuring global stability. 


7. Other nonlinear systems that exhibit the SR effect 


Once the methodology for designing nonlinear systems 
that exhibit the SR effect is obtained, the results can 
now be easily generalised to other nonlinear systems. 
Several other applications that also are derived from 
nature are discussed. Some of these applications are 
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Figure 9. Example | equilibrium points. 


well-known in the nonlinear dynamics area 
(Strogatz 1995), but have not been previously discussed 
within the context of the SR effect, which is a type of 
optimisation. This demonstrates that the subsequent 
applications also have some optimisation properties in 
their outputs of their physical structures, which may 
not be known at this time. Several other applications 
have already demonstrated that they exhibit pitchfork 
bifurcations involving the locus of their equilibrium 
points. The first application involves models of 
magnets and neural networks (Palmer 1989). 


7.1. Example I — mechanical models of magnets and 
neural networks 


From Palmer 1989, the following equation is known 
to exhibit a pitchfork bifurcation as the parameter £ is 
varied with B>1: 


xX =-—x-+ Btanhx. (6) 


To show this result and to determine the equilib- 
rium points, a plot versus x is made of both sides of 
the equation x=ftanhx in Figure 9 for common 
intersection points as shown (setting x = 0). The pair 
of equilibrium points +6* have the requisite symmetry. 
Figure 10 shows the resulting pitchfork bifurcation 
diagram. Again, the design of the two stable equilib- 
rium points can be placed anywhere in the phase plane 
desired through the choice of the 6>1 parameter in 
Equation (6). 

The next application is a biological example and 
includes a bifurcation and a catastrophe cusp curve. 
The model is for a sudden outbreak of insects and 
shows the interaction of insects with their food supply 
and the effect of predators (Ludwig, Jones, and 
Holling 1978). 
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Figure 10. The pitchfork bifurcation for Example 1. 


7.2. Example 2 — the interaction of insects with their 
food supply and predators 

Let M(t) represent the insect population, which is 

affected by its growth rate determined by food supply 

and death rate determined by predators: 


M(t) = growth rate death rate (predator attacks). (7) 


The growth rate includes the typical logistics model 
that reaches saturation. 


N 
Growth rate = RN(I - z): (8) 
Thus, the growth is initially exponential 


(M(t) = RN(t)) and after K individuals appear, the 
growth rate then turns negative and the population 
saturates. To model the death rate, the predators are 
birds that start to reduce M(t) after they reach 
the threshold N> A and is modelled via: 


BN? 
Death rate = A2 + N2 (9) 
Thus, Equation (7) now has the complete form: 
; N BN? 
N=RN(1 : 10 
( K ) A? + N? oe 


A normalisation is conducted by the choice of 
variables (Strogatz 1995) of x=4,1r=#,r=%, 
k= * and Equation (10) simplifies to the dimensionless 


form: 
dx x x 
reali Caer? ae OS 


To find the equilibrium points of Equation (11) 
requires the simultaneous solution of (dx/dt = 0) 


x x 
XVo(x) = x1 (1 - “) = Te = xy1(Xx). 


(11) 


(12) 
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Figure 12. Plots of (v2(x)—y\(x)) from Figure 11. 


The equilibrium point at x=0 will be considered 
separately from the case x>0, which is now explored 
by the functions y;(x) and y2(x) for their common 
intersections. If A is large, Figure 11 shows the 
simultaneous plots of the left- and right-hand sides of 
Equation (12) demonstrating three equilibrium point 
intersections. They are shown in Figure 12 and they 
satisfy the properties, where an SR effect will occur 
(two stable equilibrium points separated by an unsta- 
ble equilibrium point). The bifurcation curve is very 
interesting because it is a slice of a cusp of a 
catastrophe curve. Figure 13 portrays the bifurcation 
curve, where the SR effect will occur and Figure 14 
shows this curve as a top view projection of a cusp 
catastrophe surface. 

Finally, it is noted in Figure 12 that the symmetry 
property now disappears about the origin for the set 
of all equilibrium points (satisfying x=— x) are not 
always true. The next example also loses this symmetry 
when small variations are made in the assumptions of 
the underlying dynamics using the classical SR model 
of Equations (1) and (5). 
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Figure 14. Catastrophe cusp and bifurcation. 


7.3. Example 3 — an approximation to the classical 
SR method of Equations (1) and (5) 


A small variation of the homogeneous form of 
Equations (1) and (5) can also produce the SR effect, 
which is not symmetric, but may have utility for certain 
other applications. Starting with 


x=h+rx—x?. (13) 


If h=0, this is identical to Equation (5). However, 
if A small, one would expect the SR effect may still 
occur. To investigate this small variation to Equation 
(5), plot y=—hA and y=rx—.x* for common points of 
intersection (the equilibrium points of Equation (13)) 
as shown in Figure 15. Three equilibrium points will 
occur (two stable interlaced between an unstable) if 
the term / is smaller than h, given by 

2r |r 


h<h,=—,|z 


Ue (14) 


Appendix 2 derives this result. Figure 16 portrays 
the bifurcation curve for the equilibrium points and the 
region, where the SR effect will occur in h—r space. 

Finally, additional nonlinear systems are shown 
to generate the SR effect from a number of other 
procedures (Repperger, Alderman, and Djouadi 2006) 
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by manipulating the location of the 


equilibrium points. 


spatial 


7.4. Other systems that exhibit the SR effect 


As mentioned earlier, there are almost an infinite 
number of ways to generate nonlinear systems that 
show the SR effect. In Repperger et al. (2006) the 
attribute of enhancing SNR gain was leveraged by 
using the SR effect. Some of the systems with nonlinear 
dynamics are given below. In all cases, the phase plane 
was partitioned into rectangular regions, with the 
pattern of two stable equilibrium points being sepa- 
rated by a single unstable equilibrium point. This 
technique of the ‘management of the physical place- 
ment of equilibrium points or manipulation of the 
shape of the potential function’ is not unlike applica- 
tions in control theory, which typically would be 
concerned with placing closed loop poles in certain 
regions of a pole-zero diagram. Additional systems 
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that show the SR effect 
Gi4.1 > Gy): 


include (b;.,;>5;>0, 


x = —(x + a, )(x + a2)(x — a3)(x — a4)(xX — a5) (15) 


X= —(x+a))(X+ d2)(X + a3)(X — d4)(X — a5)(X — a6) 
x (x- a7) + ag(e~"'* _ eB) + ao(e73* _ gr) 
(16) 


and so on. Hence the types of nonlinear systems that 
can show the SR effect are practically unlimited 
because the a; and 5; terms in Equations (15) 
and (16) are somewhat arbitrary. 


8. A numerical simulation to show the SR effect 


To show the potential efficacy of the SR effect, it is well 
known in the extant literature (Loerincz et al. 1996; 
Chapeau-Blondeau 1997) that the SR effect can 
significantly amplify the SNR of certain signals, espe- 
cially under high noise conditions. This provides an 
explanation in biology on how creatures in nature can 
detect the presence of mating partners or predators with 
abnormally high values of sensitivity. The nonlinear 
sensing systems of these animals may have dynamics, 
where this SR effect occurs to promote sensitivity and 
specificity in the identification of objects. 

The question of how much signal-to-noise ampli- 
fication can be achieved presents an intriguing issue. 
It has been hypothesised that SR may amplify the SNR 
of an input signal S(t) a factor of 10,000 or more 
(Loerincz et al. 1996) and the following example was 
selected to examine this effect through an exhaustive 
numerical simulation (Repperger et al. 2006). 

With reference to Figure 2, the signal S(t) is 
selected to be the sum of two sine waves, i.e. 


S(t) = sin(2x(1)t) + sin(27(4)2). (17) 


Thus, the deterministic test signal input S(f) in 
Figure 2 consists of the sum of two sine waves with 
unity amplitude and fundamental frequencies at 1 Hz 
and 4Hz. This is a biperiodic signal as considered by 
Khovanov and McClintock (2007), where it was shown 
that a high frequency bias can control synchronisation 
at the lower frequency. In a subsequent work 
(Khovanov 2008), it was shown that improvement of 
SNR can be obtained by increasing array elements and 
efficiency of information processing. Figure 17 shows 
the test signal S(7) in real time and Figure 18 portrays a 
fast Fourier transform (FFT) of S(£) indicating spectral 
power identified at 1 Hz and 4Hz. The noise term 7(f) 
in Figure 2 is zero-mean Gaussian with variance o, 
which will be varied. The objective is to measure 
only y(t) in Figure 2 and attempt to identify the 
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Figure 17. Test signal S(7) in real time. 
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Figure 18. FFT of S(#) test signal. 


original S(t) signal in Equation (17). As the variance o7 
(power) in the noise 7(f) increases, it becomes increas- 
ingly difficult to identify in the FFT of y(t), where the 
two principal components at 1 Hz and 4Hz appear. 
However, the SR filter behaves much better. To show 
this result for the case of the output of the SR filter, 
Figure 19 shows the power spectra FFT of y(t) for the 
case o= 2000 and Figure 20 for the case of o = 9000. 
To provide a comparison to the performance of this 
SR filter, a unity gain block (which is the case without 
the SR effect) is selected as a baseline comparison and 
is determined by Equation (18). In all the spectra plots, 
the frequency resolution Af=0.25Hz=1/(total real 
time simulation in seconds). Also At=4s/1024 points. 


8.1. Unity gain block as a baseline and comparison 
to the SR filter 


y2(t) = S(t) + n(0), (18) 
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Figure 20. Spectra of y(t) for o = 9000. 


where, again, 7(7) is zero-mean Gaussian with variance 
o°. Figure 21 shows the power spectra FFT of y(t) 
when o=70. From Figure 21, it is not possible to 
discern the frequencies of the input signal S(t) at 1 and 
4Hz. Thus, the noise has completely overpowered S(7), 
which is now significantly subthreshold. Thus without 
the SR filter, it is not possible to identify the input 
signal. To fairly compare the SR filter to a unity gain 
block in Equation (18), the following SNR figure of 
merit was selected (Gingl, Vajtai, and Kiss 2000): 


SNR = (average power in | Hz and 4Hzsignal)/ 
x (average power in adjacent spectra). (19) 


Thus, as the noise power variance o* increases in 
magnitude, the FFT of y(t) is calculated and the 
average power in the frequencies | Hz and 4Hz are 
compared to the average power in the frequencies 
adjacent to the spectra estimates due to noise only at 
adjacent frequencies (0.75Hz, 1.25Hz, 3.75Hz and 
4.25Hz in Figure 21). As the SNR approaches 1.0, 
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Figure 21. y2 spectra for o= 70. 
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Figure 22. Comparison of SR to unity block. 


it becomes virtually impossible to detect the power in 
the original signal S(t) (cf Figure 21 for the case of 
y2(t)) at the frequencies 1 and 4 Hz. 

To evaluate the SR effect for y(t) computed via 
Equation (1) (Figure 2) versus the unity gain baseline 
(y2(t)) from Equation (18), the SNR variable of 
Equation (19) was determined by computer simulation 
as o increases in value for both systems. Figure 22 
shows this comparison. The y-axis is 10 logo of the 
SNR variable. The top curve is for y(t) as the output 
of the SR filter in Equation (1). The bottom curve is 
for the traditional (unity gain) of y2(t) of Equation 
(18). The x-axis is proportional to o. The difference in 
performance (the gap in Figure 22) exceeds 60 units on 
a 10 logig scale, which is equivalent to a signal-to-noise 
amplification gain of 10° and greater. Thus, the SR 
filter in Equation (1) and Figure 2 amplifies the SNR 
a factor of a million as compared to the tradition 
situation of unity gain. It should be cautioned, 
however, that one caveat of this analysis is a form of 
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Figure 23. Millions of FLOPS versus SNR amplification 
gain. 


a no-free-lunch theorem, which occurs due to compu- 
tational complexity. 


8.2. A no free lunch theorem 


The example just presented seems to infer that SNR 
gains using SR systems (for this elementary example) 
may exceed 10° or higher using the SR effect versus 
not using it. However, there is a computational cost 
for this improvement of SNR that may limit its 
applicability. Figure 23 shows a plot of millions of 
computational floating point computer operations 
per second (FLOPS) versus the SNR gain in decibels 
(10logio(SNR)). It is seen that to obtain SNR 
amplifications of the order of 10° it requires 
2x 10'° FLOPS, which may induce an excessive 
computational burden. Thus, the limits in getting the 
SNR quality of the output signal improved may not 
be practical in certain applications. This is a topic of 
future research. 


9. Summary and conclusions 


SR is powerful effect in nonlinear systems which can, 
for example, significantly amplify the signal-to-noise 
characteristics of a deterministic input signal S(¢). 
The numerical simulation shows significant gains of 
SNR using an SR system compared to a unity gain 
with an accompanying computational cost. The 
nonlinear control theory aspects of this problem 
are examined in this article showing numerous such 
filters can be constructed and designed via their 
nonlinear dynamics and through the manipulation 
of equilibrium points. 
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Appendix 1: Interpreting Figures 1 and 2 in terms of 
statistical optimality 


In Figure 1, assume the resonance curve has as the y-axis the 
probability of detecting S(t) in the high state, when this is 
really true. The x-axis would be the power (variance) of a 
zero-mean Gaussian noise source. From Figure 3, S(t) is 
subliminal (below the threshold) in the top graph and cannot 
be detected. Hence, the SR curve in Figure | must start at the 
origin. As the noise power is increased, the middle diagram in 
Figure 3 now becomes descriptive. Note when S(‘) is in the 
high state, it sometimes pierces through the threshold and is 
correctly detected as being high. This reduces the number of 
misses when the ground truth is that S(t) is high. Thus, the 
SR curve rises from the origin in Figure | as the noise power 
increases. As the noise power continues to grow larger, the 
SR curve reaches a peak. The bottom diagram in Figure 3 is 
now appropriate. The detection probability has increased 
through the gradual reduction of misses of detecting S(¢) 
being in the high state. However, with too much noise power 
added, the number of false alarms (S(t) is detected as high, 
but really it is low) increases to the point that the overall 
detection accuracy starts to decline. Thus, the SR curve 
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reaches a peak in Figure | and then decreases back to zero as 
the noise power continues to increase. The benefits gained by 
reducing the misses in the detector are less helpful due to the 
increases in the number of false positives for large values of 
noise power. Finally, the SR curve approaches zero as the 
noise power becomes large, with little benefit derived from 
the addition of the noise. 


Appendix 2: Validating Equation (14) 
In Figure 15, the equation for y(x) is given by 


y(x) = rx — 2°. 


(A.1) 


To find the local minimum and maximum of y(x), the 
first derivative of y(x) is set to zero. 


d(x) d Gs 
dx : 


Solving for x yields the two stable (symmetric) equilib- 
rium points for the local maximum and minimum: 


r 
Xmin = —Xmax = -/5 r>0 


But the 4 value (from Figure 15) is obtained from 
substituting the result from Equation (A.3) into Equation 
(A.1) for (x) at the equilibrium point, hence: 


or fr 
hr) = rXmax — Gear = zh 


and it can be shown that the negative of Equation (A.4) 
occurs for the locus of negative equilibrium points. 


e)=r—3x7 =0. 


(A.2) 


(A.3) 


(A.4) 
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Abstract—Automated conversational agents, also known as 
“chatbots” or “chatterbots,” are computer programs used in a 
variety of collaborative communications systems, often for 
entertainment or business purposes. However, their use as 
malicious tools has more recently made them a growing 
nuisance and security concern. We present a detailed graphical 
and statistical analysis of communication patterns (specifically 
involving message sizes and inter-message delays) for improving 
the detection of automated conversational agents in 
collaborative computer-mediated communication systems. 


I. 


Conversational agents are automated natural language 
communication programs that interface with humans via real- 
time text-based computer-mediated communications (CMC). 
These programs are also known widely as “chatbots” or 
“chatterbots.” The use of conversational agents (hereafter 
referred to as “chatbots” or simply as “bots”) for 
entertainment, education, and business communications is 
growing in popularity and utility [1,2,3,4]. As the technology 
continues to progress in sophistication, bots will undoubtedly 
find emerging application in many other areas. 


INTRODUCTION 


Unfortunately, chatbots are already finding application in 
the Internet’s underground. They are an increasing nuisance 
for chat, IM, and text-based CMC users, because of phishing 
attempts and through the spread of malware and spam [5,6,7]. 
More maliciously, they are being used as automated 
intelligence agents and scam artists, e.g., by attempting social 
engineering and teasing out personal information from users to 
be used for identity theft [8,9]. Bots can pose considerable 
security risks considering that they may be dispatched in 
virtually unlimited numbers and can work tirelessly at their 
goals, in contrast to individual humans attempting such feats 
in a person-to-person manner. To combat this threat, reliable 
methods for detecting bots are needed, so that malicious 
automated agents can be identified and removed from CMC 
systems. 
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Some previous and current attempts to stop chatbots from 
entering public chat rooms utilize CAPTCHAs (Automated 
Turing Tests) so that only humans can enter [e.g., 10], but this 
defense can be bypassed with human assistance. Keyword- 
based filtering and related spam detection methods are 
common but seem to be having limited success with chatbots. 
Gianvecchio et al. [11] suggested using metrics based upon 
gross communication patterns, such as communicators’ 
message sizes and inter-message delay times. Using a large 
dataset of public chat transcripts, they provided extensive 
evidence that chatbots could be classified into as many as 14 
unique behavioral types based upon these metrics. And these 
metrics, when coupled with pattern-matching and/or machine- 
learning algorithms, could be extremely effective at 
distinguishing between humans and bots in computer- 
mediated communications. However, the chat data they 
analyzed was limited to public chat rooms with many multiple 
simultaneous communicators, and was primarily focused on 
bots that were attempting to get users to click on hyperlinks 
(either posted in messages or in their online profiles). It was 
unclear from their work whether similar chatbot behavioral 
patterns would be present in other types of CMC, such as one- 
on-one instant messaging, or when bots are attempting to carry 
on lengthy, convincing conversations with humans. 


We attempted to study such questions in a previous work 
[12], using a different (and considerably smaller) dataset. We 
provided supporting evidence that the gross behavioral metrics 
suggested by Gianvecchio et al. (message sizes, inter-message 
delays) could be useful for passively distinguishing between 
humans and bots. In this work, we present a more detailed 
graphical and statistical analysis of related measures using a 
similar but larger data set. 


Il. METHOD 


A. Chat Transcript Data Set 


As in our previous work [12], we gathered our chat data 
from the publicly-available transcripts of the Loebner Prize in 
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Artificial Intelligence [13]. The Loebner Prize is a formal 
public Turing Test competition in which the world’s best 
chatbot programs compete to be the most “human” in terms of 
conversational capabilities. The Turing Test was suggested by 
British mathematician Alan Turing as a practical method for 
gauging the “intelligence” of an  artificially-intelligent 
conversational agent [14]. The Test requires human judges to 
determine through text-based conversations whether their 
conversing partner is a human confederate or a computer 
program. 


One of the great advantages of the Loebner Prize dataset is 
that the communicators are all unambiguously defined as 
either humans or chatbots; another is that an exceedingly large 
proportion of the conversations specifically involve chatbots, 
whereas other public chat datasets do not typically possess 
such high frequency of bot communications. Yet another 
advantage is that the bots are specifically trying to carry on 
human-like conversations with people for lengthy periods, 
whereas in many public data sets bots are mostly trying to 
convince users to click hyperlinks posted in their messages or 
in their online profiles [11]. 


A potential disadvantage of this dataset is that the 
conversational situation might not be considered “typical,” in 
that most of the human judges are suspicious as to the other 
communicators’ identities from the start, and many of the 
conversations might be rightly classified as “interrogations” as 
opposed to merely “normal, everyday conversations.” 
Nonetheless, since both the human participants (confederates) 
and bots faced similar interrogations by the human judges, we 
do not believe this to be a fatal flaw to our analyses and 
interpretations; indeed, in our previous work [12], there was 
little difference between the two human groups (judges and 
confederates) on the inter-message delay and message size 
distribution measures. In any case, the nature of our dataset is 
certainly a caveat to be kept in mind when interpreting our 
data and drawing conclusions. 


Although in our previous work [12] we used data only 
from the 2008 competition, in this work we analyzed five 
separate competitions, from the years 1996, 1997, 2004, 2005, 
and again from 2008. These particular years were chosen for 
analysis -- instead of using all 15 years of available data -- for 
several reasons: these years’ data included text files, which 
allowed for ease of analysis; and these years included all the 
necessary and desired information for analysis of the 
transcripts, i.e., some years did not include clear delineation of 
individual communicator identities, time-stamping, structuring 
of conversations, etc. 


This dataset captured a total of 9,206 messages in 254 
brief conversations involving over 50 individual humans and 
22 chatbots. It should be noted that some bots (and humans) 
were apparently repeat participants across two or more 
competition years. 


B. Data Analysis Methods 


From the raw transcript data, we calculated three primary 
metrics of interest: inter-message delay times (the time 
between sequential message transmissions), message size (the 
number of words transmitted per message), and message rate 


(the message size divided by the inter-message delay, in words 
per minute or wpm; note that this measure is related but 
distinct from typing speeds which are also typically measured 
in words per minute). From these metrics, we analyzed 
features of the human and bot distributions, including central 
tendencies, dispersions, skewness, kurtosis, complexity, and 
visual analysis of distributions. Other statistical tests (1.e., 
ANOVA) are performed where appropriate, using alpha=.05. 


Messaging complexity was measured via two methods: 
normalized entropy and normalized entropy rate. Entropy was 
calculated using Shannon’s information entropy approach 
[15], and can be interpreted in this context as the complexity 
of messaging behaviors. Entropy was measured by first 
calculating the empirical probability, p(x), of occurrence of 
each inter-message delay, message size, and message rate. 
The probabilities for each category were then used in the 
Shannon entropy equation: 


A(x) = -L (p(x) *log(p(x))) 


Entropy rate can be interpreted as the entropy of sequential 
pairs of observations. For our purposes, high entropy rates 
would suggest a highly complex messaging process. Similar to 
above, the empirical probability, p(x,), was calculated for each 
sequential pair of observations (i,j) in the inter-message 
delays, message sizes, and message rates. It is similar in form 
and calculation to the entropy calculation: 


A(x) = -L(pXip *log(pxi) 


Due to the fact that these measures are sensitive to sample 
size, we normalized our calculations of both the entropies and 
the entropy rates (giving values from 0 to 100%). 
Normalization was accomplished by taking the observed 
entropy (or entropy rate) divided by the maximum possible 
entropy (or entropy rate) given the sample size, n: 


Normalized H(x;) = ->.\(p(x) *log(p(x)) / log(n)) 


These normalized entropy and entropy rates provided 
measures of complexity that could more be easily compared 
across the varying the sample sizes used in these analyses. 


Ill. RESULTS AND DISCUSSION 


A. Inter-Message Delay Times 


Overall results concerning inter-message delay times are 
presented in Table I; further results are presented and 
discussed in the text and associated figures. 


TABLE I. INTER-MESSAGE DELAYS 

Measures Alumans Bots 
Mean (arithmetic) 20.4 73 
Median 17.5 4.0 
Standard Deviation 13.6 9.2 
Skewness 0.793 2.183 
Kurtosis -0.074 5.800 
Sample size (n ) 5424 3527 
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Figure 1. Inter-message delay relative frequency distributions for humans 


and chatbots. Inter-message delays were measured in seconds. 


1) Central Tendencies and Dispersions: Average human 
inter-message delay times were substantially longer in 
comparison to bots (20.4 versus 7.3 seconds, respectively). 
Median values produced a similar pattern (17.5 versus 4.0 
seconds, respectively). Variability for the human inter- 
message delays was also noticeably larger (standard 
deviations of 13.6 versus 9.2 seconds). 


2) Distribution Shapes: Measures of distribution shapes 
were also noticeably different between humans and bots. 
Human inter-message delay distribution skewness was 
measured at 0.793 (with a standard error of skewness of 
0.033), while the bot distribution skewness was measured at 
2.183 (with a standard error of skewness of 0.041). Human 
inter-message delay distribution kurtosis was measured at - 
0.074 (with a standard error of kurtosis of 0.066). Bot 
distribution kurtosis was measured at 5.800 (with a standard 
error of kurtosis of 0.082). 


3) Graphical Analysis: As can be seen in Fig. 1, human 
inter-message delays follow a distinctly different distribution 
pattern than bots. To allow a more straightforward comparison 
to the results of Gianvecchio et al. [11], we also plotted the 
inter-message delay distributions on log-log scales in Fig. 2. 
The advantage of this plotting method is that a power law 
distribution (which we would expect to find for the humans) 
tends to reveal itself as a linear pattern when plotted on log- 
log scales. Just as Gianvecchio et al. found, the bulk of the 
human data points (inter-message delays of 10 s or greater 
comprise almost 75% of the distribution) fall nearly on a 
straight line in this log-log plot; the same is not quite true for 
the bots. 


4) Discussion: Inter-message delay times seemed to offer a 
variety of distinguishing measures for detecting bot versus 
human chat communications. At least in our dataset, 
potentially useful metrics involving inter-message delays 
included: measures of central tendency, dispersion or 
variability, skewness, and kurtosis. These results directly 
support our previous work in [12], the results of Gianvecchio 
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Figure 2. The same data as in Fig. 1, but plotted on logarithmic X and Y 
axes. The bulk of the human data points (almost 75%) fall nearly on a 
straight line (the dashed grey line). 


et al. [11], and other human chat data [16] demonstrating 
temporal patterns in human communication that can be used to 
distinguish them from chatbots. Further, these results 
indirectly support other research showing that humans exhibit 
distinct temporal patterns in virtually all forms of their 
communications, including asynchronous CMC such as 
blogging, texting, cyber forums, e-mailing, and even non- 
CMC including surveys and spoken communication [17]. 


B. Message Sizes 


Overall results concerning message sizes are presented in 
Table II; further results are presented and discussed in the text 
and associated figures. 


1) Central Tendencies and Dispersions: Average human 
message sizes were only slighter smaller than bots (8.4 versus 
9.3 word count per message, respectively). Median values 
produced a similar pattern (6.0 versus 7.0, respectively). 
Variability, too, was very similar between humans and bots 
(standard deviations of 7.8 versus 8.6 words per message). 


2) Distribution Shapes: Measures of distribution shapes 
were somewhat mixed on whether there was a noticeable 
difference between humans and bots. Human message size 
distribution skewness was measured at 3.077 (with a standard 
error of skewness of 0.033), while the bot distribution 
skewness was measured at 2.603 (with a standard error of 
skewness of 0.041), suggesting little difference on this 
measure. However, human message size distribution kurtosis 
was measured at 17.213 (with a standard error of kurtosis of 
0.066) while bot distribution kurtosis was measured at 12.464 
(with a standard error of kurtosis of 0.081), which may 
suggest a possible distinguishing metric. 


3) Graphical Analysis: As can be seen in Fig. 3, human 
and bot message size distributions are highly similar and 
practically indistinguishable. 
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TABLE II. MESSAGE SIZES 

Measures Humans Bots 
Mean (arithmetic) 8.4 9.3 
Median 6.0 7.0 
Standard Deviation 7.8 8.6 
Skewness 3.077 2.603 
Kurtosis 17.213 12.464 
Sample size (n ) 5586 3620 
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Figure 3. Message size relative frequency distributions for humans and 


chatbots. Message sizes were measured in word count per message. 


4) Discussion: Analysis of message sizes offered very 
little in terms of distinguishing between humans and bots. This 
result is in contrast to our previous findings (with a smaller 
sample) that suggested message sizes to possibly be a 
distinguishing metric [12]. This result is also somewhat in 
conflict with the results of Gianvecchio et al [11] who found 
many “types” of bots whose message size distribution shapes 
differed widely from the observed human distributions. Even 
though in our sample there was a statistically significant 
difference between human and bot message sizes (F=34.95, 
p<.001), the statistical effect size was near zero (partial- 
eta’=.004). Indeed, visual inspection of the relative frequency 
distributions of human versus bot message sizes shows 
extensive overlap and similarity of distribution shapes 
between the groups, and both distributions appear to follow an 
exponential decay (after excluding the initial spikes). 


C. Message rates 


Overall results concerning message rates are presented in 
Table HI; further results are presented and discussed in the 
text and associated figures. 


1) Central Tendencies and Dispersions: Average human 
message rates were measured at a reasonable 35.7 words per 
minute (median of 24.0 wpm), while average bot message 
rates were a blistering 145.4 words per minute (median of 72.0 
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TABLE III. MESSAGE RATES 

Measures Humans Bots 
Mean (arithmetic) 35.7 145.4 
Median 24.0 72.0 
Standard Deviation 67.4 233.4 
Skewness 10.36 4.72 
Kurtosis 151.46 31.90 
Sample size (n ) 5364 2805 
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Figure 4. Message rate relative frequency distributions for humans and 
chatbots. Message rates were measured in words per minute (wpm). 


wpm). Variability, too, was widely different between the two 
groups. The standard deviation of human message rates was 
surprisingly large at 67.4 wpm, but bots were even larger at 
233.4 wpm. 


2) Distribution Shapes: Distribution shape measurements 
regarding message rates suggest large differences between the 
groups. Human distribution skewness was measured at 10.358 
(standard error of skewness was 0.033), while bot skewness 
was 4.720 (standard error of skewness was 0.046). Human 
distribution kurtosis was 151.460 wpm (standard error of 
kurtosis was 0.067) versus bot kurtosis of 31.901 (standard 
error of 0.092). 


3) Graphical Analysis: As can be seen in Fig. 4, human 
and bot message rate distributions are distinctly different. 
Almost all of the observed human message rates fall below 70 
wpm, with a large spike peaking at about 20 wpm. In contrast, 
bot message rates are scattered with a flat, wide, and 
irregularly-shaped distribution. All observed message rates are 
shown in Fig. 5, and median message rates for individual 
humans and bots is plotted in Fig. 6. The differences between 
humans and chatbots in regards to message rates are clearly 
evident in both of these figures. 


4) Discussion: Message rate, a combined measure of inter- 
message delays and message sizes, seems to offer yet another 
way to distinguish between human and_ chatbot 
communications. Useful message rate metrics found in our 
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Figure 5. All observed message rates in words per minute (wpm). Notice 
the y-axis is a logarithmic scale to accommodate extreme observations. Each 
mark is a single observation for humans or chatbots. The thick dotted 
horizontal line indicates a cut-off message rate that captures 90.4% of the 
humans and excludes 60.4% of the bots (between 0 and 70 wpm, inclusive). 


data include means, standard deviations, skewness, and 
kurtosis. Had we attempted bot detection in our dataset by 
using a simple rule that included only median message rates of 
individual communicators that fell within the typical human 
range (5 to 50 wpm), we would have been able to quite easily 
detect and remove 78% of the bots while excluding no 
humans. Another simple rule that excluded individual 
messages with rates above 70 wpm would have included 
90.4% of the human messages, while excluding 60.4% of the 
bot messages. 


D. Messaging Complexity 


1) Entropy: Analysis of the normalized messaging 
entropies revealed some differences between human and bot 
communication entropy. Entropies were calculated per 
individual communicator. Results are shown in Table IV. As 
would be expected if humans demonstrated more complexity 
in their communications than bots, humans showed higher 
entropies on all three measures. The individual human 
entropies were 1.7 times higher than bots on the inter-message 
delays (Fig. 7), 1.4 times higher on message sizes (Fig. 9), and 
1.3 times higher on message rates (Fig. 11). Here, the pattern 
of results is the same as those of Gianvecchio et al. [11], who 
found the messaging complexities (as measured by the 
corrected conditional entropies) to be higher for humans in 
both inter-message delays and message sizes (they did not 
analyze message rates). Visual analysis of these distributions 
(Fig. 7, 9, and 11, top row, next page) offers a clear distinction 
between humans and bots, particularly on the inter-message 
delay and message rate entropies. 
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Figure 6. Median message rates per individual, in words per minute (wpm). 
Notice the y-axis is a logarithmic scale to accommodate extreme values. 
Each mark is the median of observations for an individual human or chatbot. 
The shaded grey region indicates the median message rates that include 
100% of the humans and exclude 78% of the bots (between 5 and 50 wpm). 


2) Entropy Rate: Analysis of the normalized messaging 
entropy rates also revealed large differences between human 
and bot communications. Averages of the individual entropy 
rate results are shown in Table V. Humans on average 
exhibited substantially higher entropy rates (i.e., complexity) 
on all three messaging behaviors. The human inter-message 
delay entropy rates were on average 2.4 times higher than bots 
(Fig. 8), message size were 1.6 times higher (Fig. 10), and 
message rates 1.3 times higher (Fig. 12). These results also 
support the findings of Gianvecchio et al. [11], who also 
examined the entropy rates of inter-message delays and 
message sizes. Visual analysis of the distributions (Fig. 8, 10, 
and 12, lower row, next page) again shows clear distinctions 
between humans and bots. 


TABLE IV. MEAN INDIVIDUAL NORMALIZED ENTROPY 


Normalized Entropy (%) Humans Chatbots 
Inter-Message Delay 80.1 46.3 
Message Size 76.9 56.9 
Message Rate 82.8 64.6 


TABLE V. MEAN INDIVIDUAL NORMALIZED ENTROPY RATE 


Normalized Entropy Rate (%) Humans Chatbots 
Inter-Message Delay 58.0 18.4 
Message Size 46.3 19.3 
Message Rate 61.9 35.1 
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(normalized). Each mark is the observed entropy 
of an individual human or chatbot. 
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Individual message size entropy 
(normalized). Each mark is the observed entropy 
of an individual human or chatbot. 
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Figure 11. Individual message rate entropy 
(normalized). Each mark is the observed entropy 
of an individual human or chatbot. 
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Figure 8. Individual inter-message delay entropy 
rates (normalized). Each mark is the observed 
entropy rate of an individual human or chatbot. 


IV. CONCLUSIONS AND FUTURE WORK 


Despite the nature of our dataset in comparison to others 
[e.g., 11], our results generally support the utility of chat 
communication metrics for distinguishing between human and 
bot communications. Specifically, we found that measures of 
central tendency, dispersion, distribution shapes, and related 
patterns of inter-message delays were very distinct between 
humans and bots. Further, we found that message rates, a 
derived measure from both inter-message delays and message 
sizes, could also be of some utility. Our entropy and entropy 
rate measurements suggested that individual bots generally 
exhibit lower complexity than individual humans in their chat 
communications, and this was true for inter-message delays, 
message sizes, and message rates. However, the statistical and 
distributional analysis of message sizes seemed to offer few 
metrics that might aid in the identification of bot 
communications. 


Future research into differentiating metrics for humans and 
bots might include the analysis of keystroke dynamics, 


Figure 10. Individual message size entropy rates 
(normalized). Each mark is the observed entropy 
rate of an individual human or chatbot. 
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Figure 12. Individual message rate entropy rates 
(normalized). Each mark is the observed entropy 
rate of an individual human or chatbot. 


alternative measures of complexity, and other graphical and/or 
statistical patterns of chat, including word choice/phraseology, 
typographical error distributions, detailed chronemic analysis, 
conversational dynamics, etc. We plan on developing 
visualization tools for chatbot detection that specifically focus 
on some of the metrics discussed in this work. We are also 
considering the development of a set of “humanness” metrics 
for rating the conversational capabilities of chatbots, which 
may also be advantageously visualized for chat users, chat 
room administrators, and AI developers. 


Chatbots will conceivably continue to progress in 
sophistication until their conversational capabilities are 
virtually indistinguishable from humans. In the future, we will 
find these automated conversational agents in a variety of 
unforeseen applications, in addition to their current growing 
applications in commerce, education, and entertainment. 
However, malicious chatbots can pose considerable security 
risks since they can be unleashed in virtually unlimited 
numbers and can work tirelessly on gathering intelligence, 
spreading malware and spam, and generally hampering 
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effective collaborations and communications. As_ bots 
approach the communication level of an average human, their 
capability to be used as malevolent tools will also grow. We 
hope the analysis and discussion presented herein provides 
improved capabilities to detect and remove malicious 
automated conversational agents from collaborative computer- 
mediated communication systems, so that such systems can be 
used efficiently, effectively, and securely for their intended 
purposes. 


ACKNOWLEDGMENTS 
We wish to thank David Dommett at the Air Force 
Research Laboratory’s 711" Human Performance 


Wing/RHCVZ, for his assistance with the entropy and entropy 
rate calculations. We also wish to thank Dr. Xiaoping Annie 
Shen, associate professor in the Department of Mathematics at 
the Ohio University, for her many helpful comments and 
critiques on the mathematical portions of this paper. 


REFERENCES 


S. Quarteroni and S. Manandhar, “A chatbot-based interactive question 
answering system,” Proc. 11" Workshop on the Semantics and 
Pragmatics of Dialogue, Trento, Italy, pp. 83-90, May 2007. 


[1] 


[2] M. Leaverton, “Recruiting the chatterbots: How virtual agents make the 
web more human,” CNET web article, Oct 2000 (accessed May 2010), 


available: http://www.cnet.com/4520-6022_1-102077-1.html. 


A. Kerly, P. Hall, and S. Bull, “Bringing chatbots into education: 
Towards natural language negotiation of open learner models,” 
Knowledge-Based Systems, vol. 20, pp. 177-185, March 2007. 


S. Kowalski, “Two case studies in using chatbots for security training,” 
Proc. 9" IFIP World Conference on Computers in Education, 2009. 


S. Thakur, “AOL no more chat room spam petition,” Petition Online, 
online public web petition (accessed May 2010), available: 
http://www. petitiononline.com/chatspam/petition.html. 


J. Hu, “AOL: spam and chat don’t mix,” CNET News web article, July 


2003 (accessed May 2010), available: http://news.cnet.com/AOL- 
Spam-and-chat-dont-mix/2100-1032_3-1024010.html. 


[3] 


[4] 
[5] 


[6] 


21 


7) 


8] 


9] 


10] 


1] 


[16] 


[17] 


B. Krebs, “Yahoo! Messenger network overrun by bots,” Washington 
Post web article, Aug 2007 (accessed May 2010), available: 
http://blog. washingtonpost.com/securityfix/2007/08/yahoo_messenger_ 
network_overru.html. 


P. Naughton, “Flirty chat-room ‘bot’ out to steal your identity,” 
FOXNews web article, Dec 2007 (accessed May 2010), available: 
http://www.foxnews.com/story/0,2933,3 16473,00.html. 


I. Fried, “Warning sounded over ‘flirting robots’,’ Beyond Binary, 
CNET web article, Dec 2007 (accessed May 2010), available: 
http://news.cnet.com/8301-13860_3-9831133-56.html. 


A. Mohta, “Yahoo chat: Captcha check to remove bots,” TechnoSpot 
Network web article, Sept 2007 (accessed May 2010), available: 
http://www.technospot.net/blogs/yahoo-chat-captcha-check-to-remove- 
bots/. 


S. Gianvecchio, M. Xie, Z. Wu, and H. Wang, “Measurement and 
classification of humans and bots in Internet chat,” Proc. 17° USENIX 
Security Symp., San Jose, CA, July 2008. 


J. McIntire, L. McIntire, and P. Havig, “Methods for chatbot detection 
in distributed text-based communications,” Proc. International 
Collaborative Technologies and Systems Symp., Chicago, IL, May 
2010. 


The Loebner Prize in Artificial Intelligence: The First Turing Test, 
annual competition website (accessed May 2010), available: 
http://www. loebner.net/Prizef/loebner-prize.html. 


A. Turing, “Computing machinery and intelligence,” in Readings in 
Cognitive Science: A Perspective from Psychology and Artificial 
Intelligence, A. Collins and E.E. Smith, Eds. San Mateo, CA: Morgan 
Kaufmann Publishers, Inc., 1988. 


C. Shannon, “A mathematical theory of communication,” Bell System 
Technical Journal, vol. 27, pp.379-423, 623-656, July/Oct 1948. 


A. de Siqueira and S. Herring, “Temporal patterns in student-advisor 
instant messaging exchanges: Individual variation and accomodation,” 
Proc. 42°¢ Hawai’I International Conf. on Systems Sciences, Los 
Alamitos, CA: IEEE Press, 2009.  Pre-print available: 
http://ella.slis.indiana.edu/ ~herring/desiqueira.herring.2009.pdf. 

Y. Kalman, G. Ravid, D. Raban, and S. Rafaeli, “Pauses and response 
latencies: A chronemic analysis of asynchronous CMC,” J. Computer- 
Mediated Communication, vol. 12, issue 1, 2006. Available: 
http://jemc.indiana.edu/vol1 2/issue1/kalman.html. 


Distribution A: Approved for public release; distribution unlimited. 
88ABW/PA Cleared 08/20/2013; 88ABW-2013-3715. 


Quantify Effects of Long Range Memory on 
Predictability of Complex Systems 


Xiaoping Shen!, Katheryn A. Farris? and Paul R. Havig? 
‘Department of Mathematics, Ohio University, Athens, OH 45701, USA 
2711 HPW AFRL/RHCV, Wright-Patterson Air Force Base, Dayton, Ohio 45433, USA 


Abstract— This paper explores the connection between uncer- 
tainty and memory effects of time series associated with complex 
system. Traditionally, information theory based algorithms, such 
as Shannon entropy and its relatives, are employed as measure- 
ments to describe uncertainty quantitatively. This study brings 
into focus the important role of the long range memory effects 
on the uncertainty measurements. The method is applicable to 
arbitrary complex systems. Financial data are investigated as 
an example. The approach provides important insights into the 
predictability of a complex system. 


I. INTRODUCTION 


Uncertainty is pervasive in complex system. Quantify un- 
certainty and how it propagates through the system is major 
challenge in complex system modeling and calibration. Many 
information theory based algorithms, such as Shannon entropy 
and its relatives, are employed as measurements to describe 
uncertainty quantitatively. Some parameters or indicators are 
used as long range memory measurements. Hurst and Holder 
exponents are the most important and popular parameters. 
Traditionally, these parameters are studies independently. On 
the contrary, the study on the relations among these parameters 
is not commonly found in literature. This study brings into 
focus the important role of the long range memory effects on 
the uncertainty measurements. More specifically, it intended 
to reveal the relation between two important parameters, the 
Hurst exponent (measurement of global long range memory) 
and the Entropy (measurement of uncertainty or predictabil- 
ity). Fraction Browning Motion is investigated in numerical 
experiments. The approach provides a new view to understand 
the uncertainty and predictability of time series from complex 
systems. 

The paper is organized as follows, after the brief introduc- 
tion, this section is concluded by providing some necessary 
background in time series analysis to make the paper self- 
contained. Section 2 discusses the statistical parameters in 
modeling non stationary time series. Section 3 is devoted to 
the discussion of numerical simulation, and the last section 
provides a summery of the empirical results and future study. 


II. BACKGROUND 


We recall brie y the mathematical and statistical definitions 
and properties of the Hurst exponents, ApEn and related 
concepts in this section. General references on self-similar 
processes and long-memory processes are given in [21] and 


[6]. 
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A. The Hurst exponent 


An important indicator of complex systems is to be able 
adapt to inputs and evolve. Adaptation and evolution are 
characteristic of critical infrastructure systems. Self-similarity 
and chaos are examples are used to describe a complex 
system qualitatively. One example of complex systems is 
a the long-memory time series which was brought to the 
attention by Hurst [16]. It has been of great interests in the 
research community ever since. Hurst exponent (or parameter) 
H is named by Mandelbrot in honor of both Harold Edwin 
Hurst and Ludwig Otto Holder. Since the pioneering work 
of Mandelbrot and Ness[21], the fractional Brownian motion 
(fBm) has become widely popular in a theoretical context as 
well as in the practice of modeling self-similar phenomena. 

H: Hurst self-similarity parameter, arises from the general- 
ization of the Brownian motion from integer to fractional. It 
was interlaced to the applied statistics community by Mandel- 
brot and van Ness [21] in honor Hurst (the original notation 
used by Hurst was K) and Holder. 

To begin, we define the fractional Brownian motion (fBm) 
and fractional Gaussian noise (fGn). 

Let (0, F, P) be a complete probability space on which all 
stochastic processes are defined. 

Definition 1: (Fractional Brownian motion). Fractional 
Brownian motion (fBm) is defined by its stochastic represen- 


tation 
t 
a 


where IT is the Gamma function defined by [I (a) 
Har t?—le~*dt and B is a stochastic process, regular Brownian 
motion defined on (Q, F, P). H is called the Hurst parameter 
or exponent of By. 

Remark 1: The stochastic representation (1) is not unique. 
The integral can be understood as a Lebesgue-Stieltjes inte- 
gral. Notice that when H=3, B,/2(t) = B(t) is a ordinary 
Brownian motion. 

A fractional Brownian motion By = {By(t),0<t < co} 
with Hurst exponent H, 0 < H < 1 is uniquely characterized 
by the following properties: 

1) By(0) = 0; 

2) By/(t) is a zero-mean Gaussian process with continuous 

sample paths and stationary increments; and 


Bull) = aay (t—7r)¥-2aB(t) (1) 
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3) The autocovariance function is given by 
1 
pu(8,t) = EBn(s)Bu(t) = s{8°" +P" —(t—-s)"}, 


4) E[B3,(t)] = Vat?” , where 


—I'(2 — 2H) cos(7H) 


Vi = Var(By) = TH(QH — 1) 


A close related parameter is, 

d: fractional integration parameter, arises from the gen- 
eralization of the Box-Jenkins ARIMA(p, d,q) models from 
integer to non integer. It was introduced by Granger and 
Joyeux [15](1980) and Hosking [17] (1981), independently. 

Definition 2: (Fractional Gaussian noise). The incremental 
process X={X;, : k = 0,1,...} of fBm, X, = By(k +1) - 
By(k) is called Fractional Gaussian noise. 

X; has following properties, 


1) X, has a standard normal distribution; 
2) The autocovariance function 


1 
yk) = sll — 1% — 2|&)P + e+ 1°) 
~ H(2H —1)k*#-? 


Two parameters induced from H and d are: 

3: power law exponent (to be definied in next section) 

For fraction Brownian motion, @ = 2H + 1; 

For fractional Gaussian noise, 8 = 2H — 1. 

Many algorithms have been developed to compute Hurst 
exponents. The numerical experiments in this project are 
based on the built in wavelet based programs in Matlab. The 
following figures illustrate the property of the Hurst exponent. 
From Figure 1 to Figure 3, the time series are generated 
with prescribed Hurst exponent, H=0.2 (negative memory), 
0.5 (regular Brownian motion, no memory), and 0.9 (long 
range memory). The Hurst exponent is an indicator of the 
smoothness of the time series. 


fractional Srown'an motion - parameter. 0.2 


logfabsifhm}} 


Fig. 1. Fractional Brownian motion — H = 0.2. 


fractional Brownian motion - parameter: 0.5 
‘ r T T 


logfabsifbrn}) 


Fig. 2. Fractional Brownian motion — H = 0.5. 


fractional Brownian motion - parameter: 0.5 


logtabs(fbm)} 


Fig. 3. Fractional Brownian motion — H = 0.9. 


B. Entropy and its relatives 


Shannon entropy [28]in its basic form is a measure of uncer- 
tainty rather than a measure of information. Mathematically, 
it can be defined as, 

Definition 3: The entropy of a discrete random variable X 
is defined by 


H(X) = — ST pla) log p(x) 2) 
tex 
Similarly, entropy for continuous random variable can be 


defined as, 
Entropy is always positive. Entropy measures the uncer- 
tainty inherent in the distribution of a random variable. 


II]. NUMERICAL SIMULATIONS 


A. Data description and empirical results. 


We first simulated the fBm sample data using a wavelet 
based algorithm. The Hurst exponents are set from 0.025 to 
0.975 with a step size of 0.025, total 39 samples. Each sample 
contains 2'° = 1024 sample points. These data are stored in 
a matrix with size 1024 by 39. 


Distribution A: Approved for public release; distribution unlimited. 
88ABW/PA Cleared 08/20/2013; 88ABW-2013-3715. 


Shannon Entropy 


Fig. 4. Relation between Shannon entropy and Hurst exponent 


1) Relation between Shannon Entropy and Hurst exponent: 
We then run Monte Carlo simulation for the Shannon entropies 
of each fBm random variable. Figure 4 shows the experiment 
results for the relation between Shannon entropy and Hurst 
exponents. 


The regression results for the relation between Shannon 
Entropy and Hurst exponent are shown in Table 1. 


Table 1. Relation between entropy and Hurst exponent 


Ly Error 
0.44116 
0.25608 
0.18002 


Relation 
Linear = —1.89 H +2.16 
Quadratic 


y = - 1.89*x + 2.16 
[Cee 
eee, 


eee 
Cee 
Cees 
Cece, J 
Cees 
idee LY YY 


0.2 0.4 0.6 0.8 1 


residuals 


r T T T T 
Linear: norm of residuals = 0.44116 


Fig. 5. Linear relatin between entropy and Hurst parameter. 


These initial studies revealed two parameters, entropy and 
Hurst exponent are not independent. 

2) Relation of Approximate Entropy and Hurst exponent: 
In his seminal work, S.M. Pincus, introduced the concept 
and algorithm of approximate entropy (ApEn) [26]. ApEn is 
a statistics that quantifies the randomness and quantify the 
degree of predictability in a time series. For instance, a low 


> 
rom 
g y = 0.813%x2 - 2.7% + 2.3 
© 2 Ceecce, 
= ®eee., 
[e} &e 
E11} eee 
= Peeece 
fe Peeecccec, 
. 0 
0 0.2 0.4 0.6 0.8 1 


residuals 


a Quadratic: norm of residuals = 0.25608 


0.1} 


Fig. 6. Quadratic relation between entropy and Hurst parameter. 


wo 


y = 1.673 - 1.7%? - 1.69" + 2.21 
|@ee 
Pete, 


N 


R 
T 


Shannon Entropy 


0.6 0.8 


residuals 


T T T T T 
Cubic: norm of residuals = 0.18002 


Fig. 7. Cubic relation between entropy and Hurst parameter. 


ApEn indicates the time series has many repeated patterns or 
less randomness. ApEn can defined as, 


Definition 4: Assume X,, X2,...,Xy are a sequence of in- 
dependent and identically distributed random variables (i.i.d). 
Each random variable takes values in {1,...,N}. Denote 
Y(i,m) = [X(i), X(@i+1),...,X(i+m-—1)] where i = 1, 
., N-m+1, 

Cri) =yotm#i:1<i<N-m 


and 
N-m+1 


oe Nome 2, In(Cy" (7) 
The ApEn value for a finite length N is, 
ApEn(m,r, N) = y™(r) — 6") (r). 


Notice that r is a parameter which measures the similarity 
of two pieces of a given time series, in particular, r = 0, 
Y(j,m) and Y(i,m) have identical patterns. The uniqueness 
of this algorithm is that it distinguishes a wide variety of 
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Approximate Entropy 


rparameter 0.15 


o1 | 1 71 
1 09 os 07 06 os 04 03 02 1 0 
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Fig. 8. Relation of the parameter r of ApEn and Hurst exponent H. 


systems. In particular, for small m, estimation of ApEn(m,r) 
by ApEn(m,r,N) can be achieved with relatively few data 
points. ApEn assigns a non-negative number to a time-series 
with larger values corresponding to more irregularity in the 
data. In short, ApEn measures the logarithmic likelihood that 
runs of patterns which are within the tolerance window (r) for 
m. contiguous operations that remain close to that tolerance 
window on subsequent incremental comparisons [26]. We 
employed the numerical algorithm developed in [27] (the 
related Matlab program is available at the Matlab Central [24]) 

Using Monte Carlo simulation, for r = 0.1 : 0.01 : 0.2, 
the relation ApEn(m,r, NV) and the Hurst exponent is show in 
Figure 8. 


IV. CONCLUSION AND FUTURE STUDY 


To quantify uncertainty or study the predictability of com- 
plex system, we attempt to fit the associated time series to 
models of non stationary process with long memory. The fBm 
models always play a important role in the non stationary 
process modeling. In the paper, we report the initial results on 
calibration fBm and the relations between Shannon entropy, 
approximate entropy and the Hurst exponent of for fBms. 
These results will pave a way to build a semi-parametric (based 
on fBm) multiscale model for quantify uncertainty of complex 
systems. 
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Abstract. To discern similarity and differences in partial 
DNA strings based on dissimilarity (distance/difference) among 
the various SNPs, one of the challenge aspects is to select 
felicitous metrics or measurements. Some of information 
theoretic quantities are often employed in _ practice. 
Unfortunately, certain information-theoretic variables for 
example, information distance and mutual information, may 
not yield consistent results for decision-making. In this paper, 
we investigate the consistency of information theoretic 
quantities Experiments are designed to show that the selection 
of measures and metrics in information-theoretic based analysis 
is crucial for decision-making. Future possible research 
directions are discussed. 


Keywords: Distance metric, information theory, DNA, SNPs 


1. Problem Specification. 

SNPs (single nucleotide polymorphisms) are DNA sequence 
variation that occurs when a single nucleotide in the genome 
differs. SNP arrays are a type of DNA microarray that detect 
SNP occurrences and act as samples of DNA strings that can 
be extracted from microchips (hardware) and other devices 
that come in contact with the DNA of living organisms. 
These SNP arrays do not represent a complete DNA string, 
which, e.g. for a human, would consist of about 3.2 x 10° 
base pairs of the human chromosome. A typical SNP arrays 
would represent a fragment of this string with a length of, 
perhaps, up to 500,000 base pairs. Each base pair of the 
human DNA may be in one of four states (A, C, T, or G). 
The goal is to correctly identify genetic sequences of 
different individuals to help classify chromosomal regions 
where genetic variants are shared. For crops and animals, 
the study of SNPs is important in fertilization and breeding. 
For human DNA, the extracted SNPs may define how people 
contract diseases and respond to certain treatments, drugs, 
vaccines, chemicals, pathogens and other agents. 
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SNPs may be great enablers in developing personalized 
medicine, but there is some controversy of how this 
information may be possibly abused. Some basic definitions 
are appropriate in this work: 


Definition 1: Phenotype — Is a measure of a trait/skill of an 
individual. 

Definition 2: Genotype — The information carried by the 
genes. 

Definition 3: Homozygous — The chromosomes are identical 
in every state. 

Definition 4: Heterozygous — There exists a SNP between 
two chromosomes. 


It is noted in Definition 4, that a SNP is not a 
weighted difference, in the sense that no distinction has been 
made between the states, e.g. A and G as being further apart 
from A and C. Future work may weigh different pair 
combinations as having distances between SNPs that are 
predicated on which base pairs are involved. For example, 
in Figure (1) three DNA strings are shown which are 
constructed, for simplicity, from a hypothesized 8 base pair 
fragment of DNA. For simplicity, the notation will be used 
that A=1, C=2, T=3, and G = 4, for the cells (alleles) 
although they are categorical variables. It is seen that DNA, 
and DNA, differ from each other by only one base pair. 
However, DNA, and DNA; differ by four base pairs. In 
some similarity sense using a distance/difference metric then 
DNA, is closer to DNA, and DNA, is further apart from 
DNA;. This paper will investigate how to characterize the 
distance/difference of the various SNPs to discern similarity 
and differences in partial DNA strings. The use of 
information-theoretic variables will be employed to study the 
use of a measure of distance of SNPs via mutual information 
as well as alternative means. 

Since the investigation of the SNPs will clearly 
depend on the appropriate measure of distance/difference 
between candidate DNAs, the use of classical information 
theoretic variables will be employed. Figure (2) displays an 
information theory channel [1-4]. In Figure (2) — Basic 
elements of an Information Channel from Shannon [1]. 
Figure (3) is a Venn diagram of the key information- 
theoretic measures involving two random variables X and 
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Y (Cover and Thomas [2], Sheridan and Ferrell [3], 
Repperger, et al. [4]). 


DNA, /1/2/3/4/3]/2]1]2 


DNA, }1]/2/1]4)/3]|2]1] 2 


pna, | 4/2/2/4]/3]2]2]1 


distance(DNA,-DNA;) > distance(DNA,-DNA,) 


Figure (1) — Three DNA strings with Different Relative 
Distances. 


H(x / y) = Lost 
Information = 
Equivocation 


H(x) = H(y) = 
Input Output 


Information Information 


H(y/x) = Noise or 


Spurious Information 
Fig. (2) — Basic elements of an Information Channel from 
Shannon [1] 
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Fal 


H(X) H(Y) 


H(x,y) = I(xzy) + De(xzy) = Ixezv)+H(\y) + Hi\x) 
The Information Variables in a Venn Diagram 
Figure (3) — A Venn Diagram of the Key Variables 


In Figure (3) the five information-theoretic 
quantities that describe the types of uncertainties (entropies) 
between the input and output elements of the information 
channel in Figure (2) are portrayed. Three of these five 
variables can be shown to be independent. 

From Figures (2,3), the five basic entities of an 
information channel can be expressed as follows: 


H(x) = The input uncertainty to the channel (1) 
H(y) = The output uncertainty of the channel. (2) 
H(x/y) = Equivocation lost to the environment. (3) 
H(y/x) = Spurious uncertainty from the environment (4) 

I(x;y) = Mutual information transmitted (5) 


More specifically, equations (1-5) can be better described by 
letting p(.) represent the probability of an event. For an 
information channel with input symbol set in Figure (2), 

x € X, of size n, and received symbols y ¢ Y at the output set 
of size q (q may not equal n), the following entropy (H(.)) 
relationships can be defined: 


HO) = ¥) p(x,)log, (1/ p(x) (6) 
i=l 
q 
HW) = >) p(y;)log,(1/ p(y;)) (7) 
j=l 
ng 
(x,y) = > p(x, y, log, (1/ p(x,.y;)) (8) 
ij 
ng 
H(x/y) = >) p(%,y;)log,(1/ p(x, ly;)) (9) 


i,j 


nq 
and Hx) = >° p(x,,y,)log,(1/ p(y, 1x)) (10) 
ij 
The important relationships that pertain to the modeling of 
the SNPs are dependent on the key variables (1-5). From 
Figures (2,3) and the basic definitions (6-10), the following 
relationship can be shown to be true (Cover & Thomas [2]): 


I(x:y) = H(x) + Hy) — H(xy) (11) 
where the mutual information I(x;y) also satisfies: 
I(x;y) > 0 (12) 


Finally, another important variable that will be used in the 
sequel is the relative information distance Da(x;y): 

Dry) = H(x/y)+ A(x) =H(x)+ Hy) -21(xy) (13) 
where Dp(x;y) also has a positivity property, as in equation 


Dx(x;y) > 0 (14) 
There are advantages the variable Deg provides over I(x;y) 
which are known in the literature (Cover & Thomas, [2], [5], 
and [6]) and restated here: 


Property 1: Dr(x;y) is a metric; however, I(x;y) is only a 
measure. Please see the appendix and a counter example 
where I(x;y) fails as a metric by not satisfying the triangular 
inequality. 

A second property can be stated as follows: 


Property 2: The relative information distance metric 
Dp(x;y) is the complement of I(x;y), i.e. 


D, (sy) =I Gey) or IGey)=Dzy) 15) 


Appendix A demonstrates this second property. 
2. Methods and Technical Solutions 
Contingency tables (Sheridan and Ferrell [3], Repperger, et 


al., [4], and Kullback [7]) will be used to formulate the SNP 
similarity and difference problem to utilize information- 
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theoretic quantities in examining the distance/difference 
between DNA strings. 

Using the DNA strings in Figure (1), Contingency 
Table | is constructed which compares DNA, versus DNA> 
in terms of similarity and differences. Contingency Table 2 
then compares DNA, versus DNA, and finally Contingency 
Table 3 compares DNA; versus DNA3. 


Contingency Table 1 - DNA, versus DNA, 


DNA, —> 
DNA, A=1 C=2 T=3 G=4 


Contingency Table 2 - DNA, versus DNA, 


DNA3—> 
DNA, A=1 C=2 T=3 G=4 
L 
A=1 - 1 - 1 
C=2 1 2 - - 
T=3 - 1 1 - 
G=4 - - - 1 
Contingency Table 3 - DNA; versus DNA; 
DNA;3—> 
DNA, A=1 C=2 T=3 G=4 
L 
A=1 - 2 - 1 
C=2 1 2 - - 
T=3 - - 1 - 
G=4 - - - 1 


Next, a normalized matrix is calculated based on the total 
number of responses in each table. The normalized matrices 
are summarized below for Contingency Tables 1-3. 


Table 1 — Normalized 
DNA, —> 


H(x) |, 
DNA, |] 2/8 | 0 | 0 | oO | 28 


0 3/8 0 0 3/8 


1/8 0 1/8 0 2/8 


0 0 0 1/8 1/8 


3/8 3/8 1/8 1/8 


Table 2 — Normalized 
DNA; —> 


H(x) |, 
pna, || 0 | 1/8 | 0 | V8) 2/8 


1/8 2/8 0 0 3/8 


0 1/8 1/8 0 2/8 


1/8 4/8 1/8 2/8 
Hy) > 


Table 3 — Normalized 
DNA, —> 


H(x) || 
DNA, ||) 0 | 2/8 | o | 1/8 | 3/8 


1/8 2/8 0 0 3/8 


0 0 1/8 0 1/8 


0 0 0 1/8 1/8 


1/8 4/8 1/8 2/8 
H(y) > 


To calculate the requisite entropies, the following 
procedures are then employed: 


Step 1: Calculate H(x) across the rows and then 
summing down the column on the right side of the 
normalized matrix (cf. Table 1-Normalized). 
Step 2: Calculate H(y) down the columns and then 
summing across the row on the bottom of the 
normalized matrix (cf. Table 1-Normalized). 
Step 3: Calculate H(x,y) for all cells in the normalized 
matrix. Then 

I(x;y) = H(x) + H(y) - H(xy) (16) 
and 
Dr(x3y) = H(x/y)+H(y/x)= A(x) + HWY) - 2 I(xy;). (17) 


The calculations proceed as follows for Table 1, for 
example: 
H(x) = -2 * (2/8)logs(2/8) - (3/8) logs(3/8) 


- (1/8) log.(1/8) = 1.9056 bits (18) 
H(y) = -2* (3/8) logo(3/8)-2* (1/8)log,(1/8 ) 
= 1.8113 bits (19) 
H(x,y) = - (3/8) logs(3/8) - (2/8) log»(2/8) - 
(3) *(1/8)log.(1/8) = 2.1556 bits (20) 
I(xsy)=H(x)+ H(y)-H(x,y)=1.5613 bits (21) 
Dp =H(x)+H(y)-21(x;y)= 0.594 bits (22) 


28 
Distribution A: Approved for public release; distribution unlimited. 
88ABW/PA Cleared 08/20/2013; 88ABW-2013-3715. 


Finally, it is noted in Figure (1) that in a distance/difference 
sense, it is expected that: 
dist. (DNA\- DNA;) > dist.( DNA; — DNA:) (23) 

Table 4 summarizes these results. It is seen that 
dissimilarities between DNAs are generally associated with 
large Dr values, small I(x;y) values, and larger Hamming 
distance values. The Hamming distance (independent of 
position) is defined as the percent of cells that differ in a 
dyadic comparison and is the gold standard in discerning 
differences between computer words. 


Table 4 — Distances between the SNPs in Figure (1) 


Distance DNA,- DNA,- DNA,- 
Variable DNA, DNA; DNA; 

Hamming 0.125 0.50 0.50 
I(x;y) 1.5613 1.3113 0.9056 
Dr(x;y) 0.5944 1.1887 1.8444 


Typically a reduction in the value of Dr would be 
accompanied by an increase in I(x;y). For the two random 
variable case (as shown in the appendix), it can be 
demonstrated that Dr and I(x;y) are complements of each 
other (ie. D, = I(xzy) and I(x;y) = Da). The results of 
Table 4 are consistent. As the Hamming distance increases 
(column 2 in row 2) when compared to either column 3 or 
column 4, then I(x;y) decreases and Dep increases, as 
expected. Two counter examples are now presented. 

3. Empirical Evaluation 

The first counter example is illustrated with Venn diagrams 
in Appendix A which shows that I(x;y) is not consistent in 
discerning distance/differences between DNAs because it 
does not satisfy the triangular inequality. 


Case 1 Counter Example with Venn Diagrams 

Please see appendix A for an example using Venn diagrams 
and set theory. This presentation is based on geometric 
arguments. It is shown that I(x;y) violates the triangular 
inequality thus does not satisfy the property of being a 
norm. The second counter example deals with SNPs. 


Case 2 Counter Example with SNPs 

To generalize the counter example, analogous to the Venn 
diagrams in appendix A to DNA identification, the 
following three DNA strings are constructed: 


DNA, }1/2/3]4/3|2]1]3/2)4 
DNA, |}1/3/2]1)/2)4/1)4]3)1 
DNA; |1/2/2)/4/3)/2/3/4]2)4 


Figure (4) — Counter Example in terms of SNPs 
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To show that difficulties may occur by using I(x;y) as well as 
Dp to characterize distance/difference between DNAs, the 
three normalized matrices resulting from the contingency 
tables are displayed for the counter example DNAs in Figure 
(4). Using similar notation, as before, Table 5 portrays 


Table 5 — Normalized 
DNA, —> 
H(x) |, 
DNA, i! 2/10 | 0 0 o | 2/10 
0 0 2/10 | 1/10 3/10 
0 2/10 0 1/10 3/10 
2/10 0 0 0 2/10 
4/10 2/10 2/10 2/10 
H(y) > 
Table 6 — Normalized 
DNA, —> 
H(x) || 
DNA, | 1/1o |} o | 1/10 | 2/10] 4/10 
0 1/10 | 1/10 0 2/10 
0 2/10 0 0 2/10 
0 1/10 0 1/10 2/10 
1/10 4/10 2/10 3/10 
H(y) > 
Table 7 — Normalized 
DNA; —> 
H(x) || 
DNA, | io} 0 | 1/10) 0 | 2/10 
0 3/10 0 0 3/10 
0 1/10 | 1/10 | 1/10 3/10 
0 0 0 2/10 2/10 
1/10 4/10 2/10 3/10 
H(iy) > 


DNA, versus DNA;, Table 6 shows DNA, versus DNA3, and 
Table 7 illustrates DNA, versus DNA;. The calculations 
from Tables 5-7 are summarized in Table 8. Also enclosed in 
this table is the calculation from the Hamming distance, 
which has been a traditional measure of distance between 
computer words [8,9]. 
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Table 8 — Results of the Calculation of the SNPs in 


Figure 4 
Information- | DNA, vs. | DNA, vs. | DNA» vs. 
theoretic DNA, DNA; DNA; 
Variable Table 5 Table 7 Table 6 
(bits) 
H(x) 1.971 1.9710 1.9219 
H(y) 1.9219 1.8464 1.8464 
H(x,y) 2.5219 2.6464 2.9219 
I(x;y) 1.371 1.1710 8464 
Dr(x;y) 1.151 1.4755 2.0755 
Hamming .80 30 .70 
Distance 


Note the following conclusions are reached from Table 8: 

(1) The Hamming Distance (gold standard) is a well 
accepted metric and will be used as a baseline (ground truth) 
to evaluate the information-theoretic variables investigated 
herein. 

(2) In Table 8, bottom row, comparing column 3 to column 
4 (results of Table 7 versus Table 6), as the Hamming 
distance increased (from .3 in column 3 to 0.7 in column 4) 
then I(x;y), decreased from 1.171 to 0.8464 which was 
expected. Also, Dg increased, accordingly. 

(3). However, when comparing column 3 to column 2 
(results of Table 5 versus Table 7), when the Hamming 
distance increased from 0.3 to 0.8, the I(x;y) should have 
decreased, but it increased from 1.171 to 1.371, which is 
inconsistent. This same inconsistency occurred with the 
variable Dp. Thus the information variables differ in their 
determination of distance/difference between DNAs and are 
not consistent with the ordering provided by the Hamming 
metric. 


Next a discussion is presented on the transitive property of 
key variables and related to the measures and metrics 
discussed so far involving decision making, in general. 


4. Transitive Property of Measures/Metrics 


From Logic: Definition: A dyadic relation R is said to be 
transitive ina set Sif whenever aRband bRc imply 

a Rc. For example, the relation “is greater than or equal” 
satisfies the transitive property for scalar numbers. 


The structure of transitivity is the mainspring of deductive 
reasoning. An argument is said to be deductive when the 
truth of the conclusion is purported to follow necessarily. 
Deductive reasoning is one of the two basic forms of valid 
reasoning. While inductive reasoning argues from the 
particular to the general, deductive reasoning argues from the 
general to a specific instance. The basic idea is that if 
something is true of a class of things in general, this truth 
applies to all legitimate members of that class. The key, then, 
is to be able to properly identify members of the class. Miss- 
classifying (or miss-categorizing) will result in invalid 
conclusions and affecting decision making, adversely. 
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One of the most common and useful forms of 
deductive reasoning is the syllogism. The syllogism is a 
specific form of argument that has three easy steps, for 
example 


1. 
2, 


Every X has the characteristic Y. This thing is X. 
Therefore, this thing has characteristic Y. 


Also, the transitive property makes elimination possible; if 
aR bandb Re, we can eliminate b and assert a R c. 


Finally, as applied to decision making, if a decision is made 
that the distance/difference between two DNAs is greater for 
one pair as compared to another pair, then the data may be 
mined out if the goal was to find highly correlated DNA 
pairs. Using I(x;y) may lead to an error by mining out more 
correlated pairs of DNAs. If a distance metric such as the 
Hamming distance (as discussed in this paper) were 
employed, then the conclusion would not suffer from that 
error. As mentioned previously, the weakness of the 
Hamming distance is that it is a relative measure, not an 
absolute measure (the position of where the SNPs are is 
lost). 


5. Significance and Impact 


Decision making based on closeness as measured by 
distance/difference between candidate DNAs is critically 
important if DNA analysis is used to make accurate 
determinations in data. Problems of consistency are seen 
when selecting Dr and mutual information (I(x;y)), being 
widely used in the literature. The property that Dr is a metric 
and I(x;y) is only a measure, demands that proper decision 
making should be predicated on at least a good measurement 
tool (Dp in lieu of I(x;y)). Apparently Dp satisfying the 
triangular inequality still does not guarantee consistency in 
the decision making, as shown earlier, on the decision 
regarding simple binary choice of a string of DNA being 
more or less similar. 


6. Future Work and New Research Directives 


As mentioned previously, the classification of the similarity 
and differences between sample DNAs and the causality 
mapping between the SNP’s scripts with the phenotype traits 
is a wide open area of research. A discussion on some of the 
fundamental problems in this area and possible solutions are 
now conducted. First some basic history is presented. 

The human genome project has its early roots in the 
1940’s when the Department of Energy made an effort to 
develop new energy resources and still understand the 
potential health and environmental risks associated with 
these resources. In 2001, two publications [10, 11] described 
the initial sequencing and analysis of the human genome. By 
2003, the sequencing was completed, two years earlier than 
anticipated. The generalizations are now far reaching. The 
DNA in each human cell is packaged into 46 chromosomes 
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arranged into 23 pairs. Each chromosome contains many 
genes (approximately 25,000 for the human genome), which 
are the basic physical and functional units of heredity. Genes 
are specific sequences of bases that encode instructions on 
how to make proteins. It is from the action of the proteins 
that the phenotype traits emerge. 

Previously discussed, SNPs are variations in the 
DNA that may be extracted as SNP arrays by microchips or 
by other processes. The question arises if the sample is 
representative of that portion of the DNA string being 
relevant to the phenotype trait of interest? This is better 
understood from some other properties that reside within the 
human DNA sequencing: (1) Only about 2% of the genome 
actually encodes the instruction for the synthesis of proteins, 
(2) The human genome sequence is almost (99.9%) exactly 
the same in all people, and (3) particular gene sequences in 
animals have been associated with numerous diseases and 
disorders, including breast cancer, muscle disease, deafness, 
and blindness. For example, in a mouse, [12] cancer 
susceptibility can be related to new gene-mapping resources 
and specific genes can be indentified that concur with mice 
that contract the disease. 

The tumor classification problem is of high interest 
in the field of bioinformatics [13]. The design of the 
candidate chips to extract the fragment DNA (SNPs) is a 
problem of considerable concern. Such systems are far from 
perfect and the environment can exert an undue influence in 
the process. The environment can mutate certain genes, thus 
producing a gene with a higher vulnerability to disease. For 
example, exposure to smoking is known to mutate a gene 
and thus produce cells that may start developing cancer. 
Thus if only one difference occurs in a base pair, this is still 
very important to capture since it may greatly influence a 
phenotype trait. 

The future problems that may be studied in this area 
can be investigated in an algorithmic way on how certain 
SNPs signatures may result in a phenotype trait. The trait 
could be a “good attribute” like resistance to disease, 
increased strength, size, and other qualities. Alternatively, 
the modified SNP signature may also be a “bad attribute” 
including susceptibility to viruses, diseases, etc. For 
simplicity of discussion, the presumption will be made that 
the phenotype trait will exist in only two states, e.g. 

Phenotype trait 1: No disease outcome (being resistant 
to a specific disease). 
Phenotype trait 2: Being vulnerable to a specific disease. 


Assume four DNA samples are taken from four individuals 
that equally fell into one of the two states above. Table 9 
would classify the four DNA samples: 


Table 9 — Four DNA samples obtained 


Individual Number | No Disease State | Disease State 
1 DNA, 
2 DNA, 
3 DNA, 
4 DNA, 
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Recall that only 2% of the DNA is related to producing 
proteins that will affect the phenotype outcomes, then to 
develop the similarities and differences between the sample 
DNAs in Table 9, the following six steps should be 
conducted: 


Step 1: Remove all common alleles (this includes the 
98% of the DNA not associated with the protein 
production). Let the symbol © represent those common 
cells (alleles) that are not related to differences between 
the DNAs. In a set theory description, it represents the 
intersection of all the sample DNAs, 1.e. 

Q= DNA; ~N DNA, ~ DNA; 4 DNA,y (24) 
Then let the underlined notation characterize that part of 


each DNA; different from the common intersection of 
all SNPs, i.e. 


SNP, = DNA; -Q (25) 
SNP, =DNA)-Q (26) 
SNP; =DNA;-Q (27) 
SNP, =DNA4- Q (28) 


Next, from Table 9, take those common SNP values for 
the diseased State: 
Step 2: SNPp = SNP, ™ SNP, (29) 
To characterize those common SNP values for the non 
diseased state: 
Step 3: SNPyp = SNP; 7 SNP; (30) 
Step 4: Now check if the disease and non diseased SNP 
portions are mutually exclusive: Is it true that: 
SNPp ia) SNPyp = ) 
where @ is anempty set? If (31) is not true, then 
recalculate steps 1-3 until the result in equation (31) is 
satisfied. 
Step 5: Now repeat steps 1-4 for more than two 
individuals. 
Step 6: With a sufficient data base built up on the two 
classes {SNDp} and {SNDyp} predictions can then be 
made for individuals outside the data used to develop 
the two classes. This will test the efficacy of this 
method. 
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Appendix A — Counter Example 1 — With Venn Diagrams 


Since geometric proofs using Venn diagrams are not technically 
permissible (Eves, [26]), we show as a test of relationships 
properties 1 and 2. In this appendix, it will be stated (Cover and 
Thomas, [2]) that Dp is a metric and satisfies the follow following 
four relationships for a metric (x,y): 


(M-1) p(xy) > Oif x # y. (positivity) (A.1) 
(M-2) (x,y) = ply,x) (similarity) (A.2) 
(M-3) (x,z) < p(x,y)+ ply,z) (triangular inequality) (A.3) 
(M-4) a(xy)=0 ifandonlyif x=y (A.4) 


However, it is shown by a testing example below that I(x;y) violates 
equation (A.3), ie. 

I(x;z) > Ixy) + I(y;z) (A.5) 
for three random variables X, Y, and Z. 


Part A — A Constructed Example to Show That Equation (A.3) 
is Violated: 


Figure (A-1a) is presented to define areas A;, A>, and A; consistent 
with Figure (3): 


Aly) =A; (A.6) 
A(x) = Ag (A.7) 
Ixy) = A2 (A.8) 


Figure (A-1b) now generalizes this concept to three random 
variables X, Y, Z. In terms of the seven areas (A,-A7) displayed, 
the following relationships become generalizations of Figure (A-1a) 
into Figure (A-1b): 
I(x;y) = Az (A.9) 
H(x/y) = Aj. (A.10) 


Random Random 

Variable Variable Ni; 
FE Ts N\, LY 

me H(X|Y) | 


x:Y)\ \ 
[O)) Hovix) | 
\ AreaA, \ / 
& AS 


/ AreaA; / 
7p 8 


Ap 
Figure (A-1a) Two Random Variables X and Y (left) 
Figure (A-1b) — Three Random Variables X, Y, and Z 


A(y/x) = As, (A.11) 
A(xly) = Ar + Ag, U(xiy) = Az + As (A.12) 
H(y|x) = A3 + Ag, I(ysx) = As + Ap (A.13) 
H(z|x) = Aq + Az, I(z;x) = As + Ag (A.14) 
H(x|z) = A, + Az, I(x;z) = Ag + As (A.15) 
H(|z) = A2t+ Az, I(ysz) = As + Ay (A.16) 
H(zly) = Ag+ Az, I(ziy) = Ag+ As (A.17) 

the left and the random variable Y to the right until: 

Ag > Az + Ay + As. (A.18) 

But: Ag= I(x;z) — As (A.19) 

Ay = I(xsy) — As (A.20) 

Ag= I(y;2) - As (A.21) 


Hence from (A.18): 
I(x;2) —As> I(xsy) — Ag+ I(y.2)-Ast+As  (A.22) 
by construction, and 
I(x;z) > I(xsy) + yz) (A.23) 
Thus it is demonstrated that equation (A.5) is satisfied and 
condition (A.3) is violated. 


X 
x 


Q 
(a, Y 
7s 


Figure A-2 — Counter example to show (Ag > Az + Aq + As) 


Property 2: Da and I(x;y) are complements of each other. 
It is intriguing that Dp satisfies the property of a metric but 
property 2 states that its complement I(x;y) does not. With 
reference to Figure (A-1a) we wish to demonstrate that Dp and 
I(x;y) are complements. By definition: 

I(xzy) = H(x) + Hy) -— H(x,y) (A.24) 

Dalxsy) = H0v/y) + HOv/x) (A.25) 

Let S represent the entire space in Figures (3) and (A-la). 
Then let e be an element of S and S = A; U Ay U Az where U 
indicates the union of sets. Note A;, Az and Ag are disjoint sets 
in Figure (A-la). From (A.11-A.12) and Figures (5) and (A- 
1a) it follows that all the elements of I(x;y) are in Ap and all the 
elements of Dg(x;y) are in A; U A3. For notational simplicity 
denote the complement of a set A as A’, then (Eves, [26]) two 
cases now exist: (1) ife € A; U As, thene ¢ Ap, thus 


(Ai U As)’ = (Ap) (A.26) 
or (2) ife € A», then e € A; U As, thus 
(Az)’ = (Ai U As) (A.27) 


It then follows that if e is an element of (A; U As) then e is an 
element of (A2)’, and if e is an element of (A), and if e is an 
element of (A; U A3)’, whence (A,) and (A; U Ag) are 
complements. 
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Runge Phenomenon: A virtual artifact in image processing 
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Abstract— Interpolation using variate functions, such 
as polynomials and trigonometric functions are common 
methods used to process discrete signal samples. Since most 
electronic devices used for quantization provide equally 
spaced signal samples, these interpolation methods are 
often used with data which is equally spaced along the 
abscissa. Unfortunately these approximations often exhibit 
the undesirable behavior of oscillations near the boundary 
of the domain which cannot be removed by increasing 
the degree of the interpolating polynomial. Regardless of 
the fact that these artifacts appear in many applications, 
including signal and image processing, Runge phenomenon 
often fails to differentiate from other similar artifacts such 
as the more popular Gibbs phenomenon. In this article, 
we present numerical examples to illustrate the Runge 
phenomenon in the context of signal and image processing. 
Both simulated and field data are used in the numerical 
experiments. 


Keywords: Runge phenomenon, Lebesgue constant, poly- 
nomial interpolation. 
2000 Mathematics Subject Classification: 65M10, 78A48. 


I. INTRODUCTION 


To transmit and store analog signals effectively, an 
analog-to-digital converter (ADC) is used to convert the 
domain of the signal from continuous-time to discrete- 
time. This follows by the reverse operation, an digital-to- 
analog converter (DAC) coverts the digital signal back 
to analog format such that the signal to be recognized by 
human senses or other non-digital systems. One of the 
central parts in both ADC and DAC is the interpolation 
circuits. 
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In practice, a wide variety of interpolation methods 
and techniques are in use based on different function 
spaces and sampling methods (the choices of nodes). 
Polynomials and trigonometric functions are more pop- 
ular among others. Since most electronic devices used 
for quantization provide equally spaced signal samples, 
these interpolation methods are often used with data 
equally spaced along the abscissa. Unfortunately the 
interpolation approximations on equally spaced data are 
often exhibit undesirable behavior near the boundary of 
the domain especially as the degree of the approximating 
polynomial increases aggressively. This behavior, oscil- 
lations near the boundary which cannot be removed by 
increasing the degree of the interpolating polynomial, 
is the so called Runge phenomenon. Although these 
oscillations appear in many applications, yet the Runge 
phenomenon often fails to differentiate from other simi- 
lar artifacts such as the more popular Gibbs phenomenon. 


Some methods such as using lower degree spline 
interpolation are introduced to overcome the artifacts. 
The interpolating at the non uniformly distributed nodes 
such zeros of Chebyshev polynomials and Barycentric 
Lagrange interpolation are examples of classical methods 
[2]. There has been renewed interest in Runge phe- 
nomenon in the past a few years. Some new numerical 
methods are developed to overcome the phenomenon, for 
example, Tikhonov Regularization and spectral methods 
are alternatives [3] and [4]. Almost all these methods are 
not ADC/DAC friendly because they are based on non 
uniform sampling. 


The primary interest of this article is twofold: 1) quan- 
tify the Runge phenomenon using Lebegue constant and 
construct numerical examples to illustrate its properties 
in the context of signal and image processing and 2) 
discuss new ADC and DAC friendly method. 


The outline of this article is as follows: a brief review 
of the Runge phenomenon will be given in Section 2. 
In Section 3, using the Lebesgue constant as a tool to, 
quantify and illustrate the phenomenon for signals (one 
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dimensional case) and images (two dimensional case), 
respectively. Section 4 is devoted to construct interesting 
numerical examples using audio signals and images to 
illustrate the phenomenon in context of signal and image 
processing. We conclude the paper by a brief summary 
and a remark on our future study. 


II. BACKGROUND ON THE RUNGE PHENOMENON 


To begin, we recall the well-known Weierstrass Ap- 
proximation Theorem and the Lagrange interpolation. 

Theorem 1: (weierstrass) Let R be a continuous func- 
tion on a closed and bounded interval of R. Then f can 
be uniformly approximated by polynomials. 

There are several different ways of proving this im- 
portant theorem. A detailed proof can be found in most 
Real Analysis textbooks, including [1] and [19]. 

In the spirit of the Weierstrass Approximation Theo- 
rem, polynomial approximation plays the most important 
role in practice. Among many other polynomial approx- 
imation, the Lagrange interpolation are important both 
in theory and practice. In the one dimensional case, the 
Lagrange interpolation can be described as follows. 

Let X = {20,2%1,...,Ln|n € N*} such that Vr; € X 
the following holds. 


a< m9 <u. << UjH-1 << Bi < B41 <a << In KO 


For any f € Cla,b], we can define the Lagrange 
interpolating polynomial P,,(f; X;x) which agrees with 
f(a;) at each of the points x; € X in the standard form 
[2]. 


Pa( fi Xa) “EK wi)L 
and 
Lt) = Il eee) (1) 
k=0,iFk (i — tn) 


Carle Runge, a student of Weierstrass, later detailed a 
seemingly paradoxical observation [20]. He noted that 
the approximation of f could diverge as the degree 
of the polynomial approximation increased and stated 
conditions for this divergence. This is also true for the 
Lagrange interpolation. It also suffers from the Runge 
phenomenon. This phenomenon is the observed effect 
of a polynomial approximations error tending towards 
infinity near the extremes of the interval over which 
interpolation occurs as the degree of the polynomial 
approximation increases. A commonly used metric for 
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measuring the accuracy of an approximation is the 
Lebesgue constant [8], [9] and [14]. For one dimensional 
polynomial approximations the Lebesgue function and 
constant are defined below [8] and [17]. 


r) = |Li(Xs2)| (2) 
i=0 
Ag(X) = mar Ay X32) (3) 


x€[a,b] 


With this groundwork, we may proceed in presenting 
some examples of the phenomenon. In the following 
sections, we will present numerical studies of the phe- 
nomenon for polynomial interpolation in one and two 
dimensions and observe how variations in node spacing 
affect the phenomenon. We will also apply these same 
interpolation techniques to audio data to observe how 
they behave in more typical situations. 


III. QUANTIFY RUNGE PHENOMENON 
A. Numerical Experiments in One Dimension Space 


1) Lebegue constant in one dimension case: To exam- 
ine the Runge prenomenon in one dimension, we have 
chosen the function ; + =>, x € [—5, 5]. In Figures 1 and 
2 the 10” and 15” degree interpolating polynomials of 
the function with equally spaced points over the domain 
are shown with the corresponding Lebesgue functions. 
These figures show the approximation of the function 
diverging near the boundaries and also show that this di- 
vergence accelerates as the degree of the approximating 
polynomial increases. In pee it can be shown [13] and 
[18] that A,,(X) >C x “r for Lagrange interpolation 
over equally spaced nodes. 

Since the Lebesgue function for Lagrange interpola- 
tion depends solely on the position of the interpolation 
points it is natural to ask if there is a spacing or set 
of points which could be used to minimize A,,(X) for 
any given function? An answer to this question can be 
found in [18] where Rivlin shows that there is not a 
set of points which can assure uniform convergence for 
all continuous functions over a given domain. This fact 
follows from establishing a lower bound for the A,,(X) 
which approaches oo as n — oo. This bound is listed in 
equation (??) and can be found in [14] and [18]. 


2 
An(X) < ~log(n) + ce >0,n= ily ere 


We can get relatively close to the lower bound by 
interpolating at the roots of the Chebyshev polynomial 
T,. The upper bound of A,,(7Z;,) is listed in equation (5) 
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K(X) <= 16a) ae eel eee (5) 
T 


2) Examples: In figures 1 and 2 we present the 
result of Lagrange interpolation using the roots of the 
polynomial T;, as interpolation points. It should be noted 
that the roots exist within [—1, 1] and we have scaled 
these to fall within [—5,5] while maintaining the same 
relative distance to one another. In table ?? we have 
presented approximations of A,, for both cases. 


1/(14x°) and Lagrange interpolating polynomial of degree 10 using equidistant spacing 


Lebesgue function 2, (X;x) for equally spaced interpolation nodes 
30/7 r 
ast! \ / \4 
| \ | | 
\ Pf 
204 | | q 
ze | 1 out 
Sy | ; 
oN | 
tor | 
\ | 
ae ee 
\ s a nse, J 
0 if oe Se i 
-5 -4 -3 -2 -1 0 1 2 3 4 5 


” 


Fig. 1. The Lagrange interpolating polynomial of degree 10 for the 
graph of esd along with the Lebesgue function for this interpolating 
polynomial. This graph is spaced using equally spaced nodes. 


1/(1+x°) and Lagrange interpolating polynomial of degree15 using equidistant spacing 


5 4 3 2 A 0 1 2 3 4 5 
x 
Lebesgue function 2,,(X:x) for equally spaced interpolation nodes. 
600 

500} /\ /\ 
| \ /\ 
400} | {| 
= | \ {| 

X 300} | | 

200F | | 

\ / 
‘oop | | | 

Ge VEN IGN, 
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Fig. 2. The Lagrange interpolating polynomial of degree 15 for the 
graph of ae along with the Lebesgue function for this interpolating 
polynomial. This graph is spaced using equally spaced nodes. 


(see Table 1. for details). 


1/(14) and Lagrange interpolating polynomial of degree10 using the roots of T,,. 


A(x) 


\ \f / 
| \J Vy 


Vv Lv 


Fig. 3. The Lagrange interpolating polynomial of degree 10 for the 
graph of eed along with the Lebesgue function for this interpolating 
polynomial. This graph is spaced using the roots of the Chebyshev 
polynomial of the first type. 


1/(14x2) and Lagrange interpolating polynomial of degree'15 using the roots of T, 


\ 
| 
| 


\/ 
y 
0 
: 


Fig. 4. The Lagrange interpolating polynomial of degree 15 for the 
graph of => along with the Lebesgue function for this interpolating 
polynomial. This graph is spaced using the roots of the Chebyshev 
polynomial of the first type. 


Table 1. Approximation the Lebesque’s constant 


Mn(Tn+1) 
1.00000 1.41421 1.44127 
2 1.25000 1.66667 1.69940 


1.63113 1.84776 1.88254 
2.20782 1.98885 2.02460 
3.10630 2.10440 2.14067 
6.92974 2.28702 2.32381 
10.94565 2.36186 2.39880 
17.84861 2.42883 2.46587 


10 | 29.89996 2.48943 2.52655 
11 |} 51.21422 2.54477 2.58194 


12 | 89.32491 2.59568 2.63290 
13 | 158.10236 | 2.64282 2.68008 
14 | 283.21120 | 2.68671 2.72400 


15 | 512.35146 | 2.72778 2.76508 
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3 second sample of violin, interpolated with 12 piecewise polynomialsof degree 75 
x10" using the roots of the Chebyshev polynomial Tn for node spacing. 


1d: 25 
seconds 


0.5 1 


Fig. 5. A 3 second audio sample approximated using twelve 75 
degree polynomials whose nodes are spaced using the roots of T7,. 


B. Numerical Experiments in Two Dimension Space 


In this section, we shall extend our examination of the 
phenomenon into two dimensions. For the examples in 
this section, we will be approximating z = Tey LE 
[—5,5],y © [—-5,5]. This is a natural extension of the 
previous formula into two dimensions. 

1) Lagrange Interpolation: In this section, the exam- 
ples and numerical results have been developed using 
a two dimensional form of the Lagrange interpolating 
polynomial listed in equation (6), Lebesgue function 
listed in equation 7 and the Lebesgue constant listed in 
equation 8. 

In the context of image processing, samples are 
most commonly taken by horizontal and vertical scans. 
Therefore, the separable base functions formed by direct 
product are used most often. We will limit our discussion 
to the separable case. There is active research on other 
more exotic partitioning schemes (see [5], [21]), but 
these are mostly of theoretical interest. 

We define the Lagrange interpolating polynomial in 
two dimensions by 


n m 
P(f;2,y) =>_>° fea y)Li(X;2)L;V3y) © 
i=0 j=0 
Where |X| = n, |Y| = m and the terms L;(X;2) and 
L;(Y;y) are computed using equation (1). In the two 
dimensional case, our Lebesgue function will be. 


Anm(Xi¥32,9) =o SO |LA(Xs2)L,(¥3y)] 1 
420 420 


Neel AS Y= pee nm X; Y5054)) 


(8) 
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We will limit ourselves to presenting a numerical 
exposition of two different partitioning schemes. First we 
will present the results for an equally spaced partitioning 
scheme this will then be compared with spacing along 
the x and y axes according to the roots of the Chebyshev 
polynomial. This should parallel our discussion in the 
one dimensional case. 

2) Equally Spaced Partition: The Runge phenomenon 
can also be observed in the two dimensional case. In the 
contour figures 6 and 7, where the interpolation nodes 
are marked with a ‘x’, we can observe what appears to 
be the approximation diverging from the actual value of 
the function towards the extremes of the domain with 
the larger polynomial approximation. This divergence is 
characteristic of the Runge phenomenon. 


8 degree polynomial approximation of 1 / (1 +x° + y*) 
—r 


= 


Fig. 6. A contour plot of the approximating polynomial of degree 
8. 
30 degree polynomial approximation of 1/ (1 +x° + y*) 
5 ao v7 10 
Pe ge Oe 
4 © Soe, 2 9 
KX XK KX x Q-K x x Kx x 
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Tea cacate atop tehte sad vee anna hs a 
2 ri 
1*# xX x x x x x x Ae 6 
ee le 
ye ae a ate we esl I e 
| 
AAD aha ced Be coe 2g! Bh ay hele \y 4 
a a a x , 
of3. : eee Se 2 
a aes aa ; 
(A) =>) 
_5 ae, ne 0 
=5 0 
x 
Fig. 7. A contour plot of the approximating polynomial of degree 
30. 


In observing the phenomenon, it is also useful to 
consider the Lebesgue function of the approximations. 
Figures 8 and 9, show the Lebesgue functions for the 
polynomial approximations of degree 8 and degree 14 


Distribution A: Approved for public release; distribution unlimited. 
88ABW/PA Cleared 08/20/2013; 88ABW-2013-3715. 


respectively. In these figures, it can also be observed that 
the Lebesgue function also increases towards the edge of 
the domain when an equally spaced partitioning scheme 
is used to select the interpolation nodes. 


sical function for or eavally spaced nodes sails 5, Sh 


te 
ee 


A / 


Fig. 8. A contour plot of the Lebesgue function for two dimensional 
Lagrange interpolation with maximum degree 8. 25 equally spaced 
nodes within the domain are used. 


lebesgue function for equally spaced nodes over [-5, 5]. 


eos 


Ss 
3H CoF) om 
eee 


Fig. 9. A contour plot of the Lebesgue function for 2D Lagrange 
interpolation with maximum degree 14. 64 equally spaced nodes 
within our domain are used. 


IV. RUNGE PHENOMENON IN SIGNAL AND IMAGE 
PROCESSING 


A. Runge phenomenon in audio signals 


While considering the one dimensional case, it would 
be interesting to consider how these methods perform 
with real data. To illustrate how polynomial approxima- 
tion performs under real conditions we will approximate 
a brief audio sample using several high degree polyno- 
mials in a piecewise fashion using both equally spaced 
nodes and nodes spaced using the roots of the Chebyshev 
polynomial, 7;,. The results can be seen in Figure 10 and 
Figure ?? respectively. 


Both methods of approximation poorly approximate 
the original data whose range is [—1, 1]. It should be 
noted however that the approximation using the roots of 
T,, did perform better than the technique using equally 
spaced nodes, as can be observed by the maximum 
values attained in each figure. This seems to fit with our 
earlier observations that the spacing of the interpolation 
nodes affects the quality of the approximation. 


3 second sample of violin, interpolated with 12 piecewise polynomials 
x 10% of degree 75 using equally spaced nodes. 
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Fig. 10. A 3 second audio sample approximated using twelve 75 
degree polynomials whose nodes are equally spaced. 


B. Runge phenomenon in images 


We illustrate sample images of Lenna with 2d La- 
grange interpolation applied using blocksizes of increas- 
ing size. It shows the phenomenon pretty well especially 
in the blocksize 11 case. Results are shown in the fol- 
lowing figures. Experiment results are shown in Figures 
11 to 15. 


Fig. 11. 


Runge phenomenon: block size=3. 


V. SUMMARY AND FUTURE STUDY 


The approximation of variate functions through in- 
terpolation techniques is common in many disciplines. 
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900 = 1000 


Fig. 12. Runge phenomenon: block size=5 


800 


900 1000 


Fig. 13. Runge phenomenon: block size=7 


It is also common to use the same techniques while 
developing models for discretely sampled data. The 
Lagrange interpolation is a practical method to apply 
and is important for the rich history it brings to the 
field. Unfortunately, when the interpolating nodes are 
equally spaced, the Lagrange interpolation technique 
often produces undesirable results especially when poly- 
nomials of high degree are used. In this paper, we have 
provided numerical examples which demonstrate the 
Runge phenomenon in both one and two dimensions. We 
also show numerically that the selection of interpolation 
nodes can have a dramatic affect on the quality of an 
approximation. Using the Lebesgue constant, we have 
also presented some discussion on how to quantify this 
phenomenon. As illustrated in section 3, the Runge phe- 
nomenon can be removed by interpolating at the zeros 
of Chebyshev polynomials. Since zeros of Chebyshev 
polynomials are distributed non uniformly, it can not 
be used to the data directly from ADC or DAC. Our 
current study are focusing on the interpolating at zeros 
of Slepian functions. With a connection map, the data 
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Runge phenomenon: block size=11 


Fig. 15. 


(uniform sampled) from ADC or DAC will be mapped to 
non uniform distributed Slepian zeros, the interpolating 
polynomial is then constructed. Related the results will 
reported in next conference - IPCV’13. 
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APPENDIX A 


Investigation of Complex Networks and Information Theory at WPAFB 
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Investigation of Complex Networks and Information Theory at WPAFB 


D. W. Repperger 
711 HPW AFRL/RHCV 
Wright-Patterson Air Force Base 
Dayton, Ohio 45433 


Topic 1 — Logistics System — an interesting application (performance /vulnerability). 


Topic 2 - Architecture — Examine Random versus Scale-free structure in networks. 


Topic 3 - Fractional Gaussian and Brownian Noise Studies (design of networks). 


Topic 4 - Percolation Networks — Cellular automata. 


Topics 5+ - Circuits (AFIT work), Fractal Filter (show applicability). 
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Many Areas Interact for Network Analysis 


Optimization 


Information 


Theory 


Bioinspired 


Fractals 


Percolation ; 
Fractional 


Calculus 
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Topic | 


Logistics 
System 
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Step | — What Architecture /Structure to use Topic 1 — Logistics System 
Minimum Interactions/communications Maximum Interactions/communications 


Case 2 — Full Spam 
Case 1 — Only Communications 
Communications (Eo 
Through the ATOF 


FS 
(Fleet 
Services) 


(Air Terminal 
Operations Flight, 


fi 
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Step 2 - Find the Equations of Motion Topic 1 — Logistics System 


How Constraint Equations Arise? 
ATOF: f, + fotiitietlo=fet it iets: +f, 


PS: Fart f, =fot+fothro 
RS: fist fot fort fis = frot fs 
FS: hiot fiat fe = fiat fist fat fay 
Cut set in circuit theory CS: fsthia = fist fe 
Cut set in Graph Theory 
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Iz Distribution A: Approved for public release; 


ee unlimited. & 
, ~ oe 
88ABWIPA Cleared 08/20/2013; 88ABW2201 @ecccccce’” 


Step 3 — Optimize Flow Topic | — Logistics System 


Maximize Mutual Information Minimize Mutual Information 


45,883 runs, saving wee soeac anne 


235 in the fitness pool. 


113 in the fitness pool. 


Optimize via Genetic Algorithms 
I(x;y) = H(x) + H(y) — H(xy) Hx) = LPC logs (U/ p(x) 


14d 
Hixy) = > P(4;,¥, log, (1/ pO, ¥,)) Hy) = Y PCy oss PO) 
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Topic | — Logistics System 


How the Optimization is Conducted (Elite Pool) 


j=1,..., 11 free chromosomes 
3 bit word for each chromosome. 


011) 
- : i icew flows. N-P 


5 constraints (4 independent). 
zen 1 0 4 15-4 = 11 independent flows to determine. Hard 
si Possipititiog= (ae) {e tary e (2°) = 8" 


Bump Out 


(8'' possibilities, NP Hard) 
Fig. 9 Configuration for the Chromosome 


Step 4 — Define Vulnerability/Sensitivity Topic 1 — Logistics System 


Let T = cut set flow, let W be the MI = /(x;y). 
AT 


Maximum Flow Minimal Flow 
_ 7 \_ _7 \__ 
( [nt | a |S 
\ /, _N /., 
—/, y, \ y, 

~= — = ~= — © 


Cut set: flows in = flows out te Cut set: flows in = flows out 


— 1 Distribution A: Approved for public release; distributiorqunlimi 
~ 10 un Its 88ABW/PA Cleared 08/20/2013; SBABW-2013°3746. UAit 


Topic 1 — Logistics System 


Sensitivity function in equation (32) for 5 computer runs 


25 + 


20 


ATOF vs PS for 5 computer simulation runs 
Figure (12) —The sensitivity Function defined in equation (32) for ATOF vs PS T W O 
di 


Sw'Dimensionless Units 


ATOF PS 
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Random Graph (Gaussian, thin tail) *"*’ 


(Highways on a map) Erdos-Renyi 


Vulnerability: Random graphs 
> flows are more robust under 
intentional attacks versus 
scale free. For random attacks 
_ oO ee all nodes should have about 
3 3 ! the same level of robustness. 


Involving cut set 
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Node number 
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Scale Free Graph (Power Law, Fat Tail) ie 


(Airports in 3D Space) 
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(> flows 
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Topic 3 — Fractional Gaussian and Brownian Noise Studies (fractional calculus). 


Consider a time series signal p(t) 


Statistical Properties 
(e.g. uand o) 


Information Properties 
(e.g. persistence, memory, 


correlation. approximate entropy) 


Topic 3 — Fractional Gaussian and Brownian Noise Studies (fractional calculus). 
power 


d= ee 7 
Bit)= |Ct)dt = ——_ Ct), P(t)=pinknoise(1/f)= “— C(t) log 
“= St, Pi=p (= £— of N 
dt * 
Fractional G. Fractional B. F 
Strongly log 
Persistent B =-slope 
Anti Non 
Persistent stationary 


Rosetta Stone for Fractional G and B: 


B = 0 = Gaussian Noise, zero 
persistence, stationary 


B= 2 = Brownian Noise 
B = 1 = Pink Noise (1/f noise) 


B > 0= Non stationary 
Weakly 


2 
Persistent B >1= Strongly Persistent 


O< B <1=> Weakly Persistent 
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Topic 3 — Fractional Gaussian and Brownian Noise Studies (fractional calculus). 


The Fractal Filter Concept — M. Barnsley The Fractal Filter Concept — M. Barnsley 


Observer Observer 


Terrorist Terrorist 
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(Wavelet Operator — R. 


Ewing RY — M. Barnsley 
D. Sonntag AOARD) 


Multiple Filters that are Fractal 
Multiple Partners: D. Sonntag AOARD, UM. Barnsley 


One Filter 


Points: Make a network (fractal geometry) resemble 1/f 
dynamics. No characteristic scale => No inherent limits of 


resolution. 
Why? Present values proportional to recent and far history 
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Topic 3 — Fractional Gaussian and Brownian Noise Studies (fractional calculus). 
(Persistence) 
Power Spectrum . . , 
White Noise = 1/f 


. X 
Brownian s& 
Noise = 1/f? ‘, 


Pink Noise = 1/f 


Timescale-Cycles per second 


1 
8 = 
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Topic 4 — Percolation Networks — Cellular Autonoma 
9,999 Points 1,111 Points 


omc qr. . 


eS 
ea - 


et 
EL 
al 


Ps ole 


How to use this m.o. to Analyze Networks? 
( Extant.work..on.Dynamics) 
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Topic 4 — Percolation Networks — Cellular Autonmata 


Step 1: Draw the architecture of the network (Barnsley support set). 
Step 2: Populate the links randomly. (Monte Carlo) 

Step 3: Test for percolation. 

Step 4: When percolation occurs, continue link population. 

Step 5: Continue until two or more percolation paths occur. 

Step 6: With multiple percolations, now start removing nodes/links. 
Step 7: If percolation disappears, then a critical node/link is identified. 
Step 8: If percolation does not stop, then this is not a critical node/link. 


C) O Now OC) 
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Work at AFIT — Lt. Col T. J. McDonald, Dr. Y. Kim 


Given an electrical circuit (white/black box), can we study its time history and identity 
its function? (Dynamic electronic footprint problem). 


Can the functionality of a circuit be hidden by adding nodes, links, etc. to disguise? 
Foreign countries reverse engineer circuits/chips/software. How do we protect? 


One solution: Make the reverse engineering N-P hard (maximum entropy). 
Maximize (spatial entropy) 


Shannon-Fano construct 
Let p = sum of all 


final nodes. 
ra 


Number o 
questions 
Dp 
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Summary and Conclusions 


Topic 1 — Logistics System. 
Topic 2 — Architecture Issues (random vs scale-free graphs). 
Topic 3 — Why are scale free systems in nature so optimal? 


Topic 4 — Percolation Networks — Vulnerability, discover solutions, 
dynamics. 


Topic 5 — Circuits/software at AFIT, Master’s students ongoing. 
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APPENDIX B 


Fractional Calculus — A New Paradigm for Understanding Complex Systems 
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Fractional Calculus — A New Paradigm for Understanding Complex Systems 
D. W. Repperger!, K. E. A. Farris! , R. Bradford! 


'_ 711 HPW, Air Force Research Laboratory, RHCV, WPAFB, Ohio 45433 


C-IED Capabilities 


Timeline for single IED event 


——_~ 


Inputs e 
: Va > » 
(tea > 


( 
node 


node 


@z%, Dstlis part of the 
Tee Ministry of Defence 


d tl 21 March 2007 
Ss © Dstl 2001 


Outputs 


Figure 3 — The Original Network-Centric Distributed System 


. Originally we are interested in flow performance of networks. 
Topics such as performance and vulnerability are important. 


.From nature we borrow concepts on optimal flow with the side 


advantage of robustness. 


. Fractional Calculus is a paradigm to investigate these 
structures and involves fractals, also ubiquitous in nature. 
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Part 1-B- Background Material 


Optimization 


Information 


Bioinspired 


(Fractals) Carea2 > 
Caveat) Fractional 
calculus 
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KJ 


Part 1-B- Background Material 


Bioinspired - Fractals Se 


Oxygen Diffusion 


0°u(x, t) _@ Ou(x, t) 
Ox? Ot 


(d,)’ = (d,)’ + (d;)’ 


Water Diffusion 


nN T dpe 
doe 
ds 


y= 2.57? Lung MEE 


Fractional 


Dimensions are 


Minimum 
energy — 


They are 
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Part 1- Background Material 


Bioinspired - Fractals Careot > 


. The Latin fractus = “broken” or “fractured” 


. Fractals — scale free (self-similar), irregular overall length 
scales. (self similar means the structure is invariant to change in 
scale). Forever continuous but nowhere differentiable. 


. Fractals may have infinite circumference but finite area. 


. Fractals can have finite volume and infinite area. 


. A fractal can be defined in the sense of a recursive equation: 
Zhe = f(z,,) 


. This is, apparently, the optimal way to distribute flow. 


. Non Euclidean Geometry. 


. Fractal examples (trees (branches), rivers, lighting bolts, cells, lung 
passageways, blood vessels, leaf patterns, cloud surfaces, molecular 
trajectories, neuron firing patterns, etc.). 
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Fractals — Lets Review the Area 


B. Mandelbrot (1960,s) asked the question: “How long is the coastline 
of Britain?” 


(Suppose we measured the 
coastline with a ruler that got 
smaller and smaller?) 


A fractal has statistical self-similarity( power law, self affine). 


A fractal has N identical parts with scale factor L. 


The Fractal dimension is 


Area = L2 


_O1CL 


Length =L Volume = L? 


(Measurement) = LP 


implies log(Measurement)= D (log(L)) 


D- log( Measurement) 


# Integer 
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Fractals — Lets Review the Area Caveat» 


_ log(Measurement) 
log L 


(Measurement) =L? 


LaA%a V3 


For irregular surfaces, we can define: 


Let N = the number of divisions of fixed length. 


Let r = length of a ruler. 


p — ogtotal Length). 
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Fractals — Lets Review the Area Caveat» 


D- log( Measurement ) 
log L 
Total Length = L? where 1<D<2 


Koch Snowflake 


—/\N 


Length = 4 = measurement 


Projection = topological dimension = 3 


oe = 1.26185 


Fractals — Lets Review the Area Different 
versions of the Koch snowflake. 


es a | / 
a \ } 4 
: : a Finite 
: Area 
A, art 
a Hh att LW tH 
: A iG : 
"a 4 wf 
er 
Circumference lim (total length) > 00 
= total length N—>00 
= (4/3)° Power law 
Log(total 


length) 


Biofractals 

21 orders of magnitude 
Microbe = 10°? g 
Whale = 10° g 


Log(1/s) 
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Fractals — Lets Review the Area. 


log( Measurement ) 
log L 


D-= 


How to determine Measurement? 


We “cover” with boxes or disks. 


-- ~ 
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Fractional Calculus — Main Points 


(non Euclidean geometry) 
d’y = 
u(t) 


Nn 
d t =, (Notation 


invented by 
Leibniz) 
What can n be? 
Answer: (In 1695, Hopital asked 


Leibniz, suppose n= %?) 
n = integer = 1, 2, 3 4, 
n = negative integer = -1, -2, -3 
n can be anon integer, n=%, 5/6. 


n can be a negative non integer, n = -.6, -3.4, 


n can be irrational: n= A | 2 
n can be a complex number: J-1 
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Fractional Calculus — Main Points 


(non Euclidean geometry) 


Why Study Fractional Calculus? 


Composite Materials 


10Log,,.(Power Gain) 


Slope = s(¥/2) 


"A 


Db 


Log of frequency 
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Fractional Calculus —Main Points 


Why use Fractional Calculus? 


(1) It can deal with functions that are forever continuous and 
nowhere differentiable (fractals). 


(2) It has the property of self similarity (scale invariance) 


d°’*(ay) Fi d*’*(ay) fi d'’* (ay) 7 d*’* (qu) Fi d‘'* (au) 


d(at)"? d(at)? d(aty'? ~~ d(at)?_— d(at)'” 
[dx ]" [d(bx)]* 


(3) It is also of the form: 


f+ = f(z,,) 


(Iterated function theory). 


(4) It can also solve partial differential equations: 
Zz 
O°u(xX,t) _» ou(%,t) 
Q ution A: Approved for Sibir release; ee! an 
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Fractional Calculus 


An Easier Way to View the Self Similarity Property 


od ar ol 


A power law f(x) = x7has the 
property that that the relative 


change in f (kx) _ 5 
f(x) 


Is independent of x 


In this sense, the functions lacks characteristic scale 


(scale free or scale invariant). Let us evaluate f (kx) 
f (x) 
Let x = y” 
Then P ; 
(kx) _ (Wy) pa Ye 


d 


f(x) y y 


Note: no dependence on x 
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Fractional Calculus —Main Points 


(310 year old area). Non Euclidean Carea2 


Common Properties 


(1) Scale Invariance — Self Similarity. 


d* (bx) _ 4g df (OX) 
[dx] [d(bx)]* 


, bt) _ ky)" _ a _ pe 
” FOX) y° y 


(3) Solves Systems in Nature (Diffusion equation). 


(4) Let m=6 
B 
d oak" = I'(m +1) mB 
dx"... vueeek CIMum afd + 1) 


Fractional Calculus —Main Points 


(310 year old area). Non Euclidean Carea2 


Common Properties 


(1) Scale Invariance — Self Similarity. 


d* (bx) _ 4g df (OX) 
[dx] [d(bx)]* 


, bt) _ ky)" _ a _ pe 
” FOX) y° y 


(3) Solves Systems in Nature (Diffusion equation). 


(4) Let m=6 


d’,_ T(m+) 


: LCMsf2 +1 
dx pproved for publi (1 tion suf 
W, 08/20/2013; 88ABW-2013-3715. 


Fractional Calculus —Main Points Carea2 


(Solution of the Diffusion Equation) 


['(z) = fewudu, PM=Le+)= ae) 


0 


. # 1k 
Thus: = { Te+n=2, pal TQ=vz 


Step 1 — Derivatives in xe 
m-1 d an m! 


m _ m-B : 
—x =mx , —x but B tb t 
ae dx? (m= (m— Bp)! ut B may not be an integer 
B T 1 a = us 
Sn Po ee Pole op 
“ ont) qe Vie) ue 


This now generalizes for derivatives in e™ 
DY e%* = qv ex 
(v not an integer) 
Generalizations to functions that can be written in a power series: 


qd 
n 
f= > 4, +b,x 
Generalizations to functions that can be written in an exponential 


a e'’ = cos(O) + isin(@) 
t +b, 
f,(t) = Da, e ol 4 ol 
E uler ’s5 La W: Distribution A: Approved for cos(@) = — y) 
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Fractional Calculus —Main Points 


(Solution of the Diffusion Equation) ren 
Step 2 — Laplace Transform 00 
F(s)=L f=] fe “at 
0 
e“|f(th\S M <o 


which holds if 


Then: 
L’[F(s)]= f() | 
ie = = = ; 
52 M) (V7 )t? 


1 t” 
~T(B +1) ee 
0°u(x, t) _@ Ou(x, t) 
ot 


Step 3 - Diffusion Equation: 5 
» OX 
U(x, s) = L[u(x,t)]= | e “u(x, t)dt 
2 
10U _, 


Cu 1 O’u : 1 
ni ~ = 54) =su(xs)- f-4 
Ot a Ox a” Ox 
Ee i i i ae ees 
U(x,s) = Ae’ + Be = ——~ [ef (r)dr 
QJs5 
Poul ~—_{_—- ae _-(x-r)y’ ™ 
| 
1 u(x,t) = fe ‘tf (r)dt 
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Summary and Conclusions 


. Investigation of flow performance of 
networks. Has initialed these efforts. 
Networks in nature have excellent flow 
performance and are very robust. 


. We look at fractal structures for insight 
into this problem. 


. Related to fractals is the Fractional 
calculus, which explains the scale free 
effects seen in nature in a mathematical 
sense. 


APPENDIX C 


Sparse Statistical Data Analysis Based on the Lj=norm = Case Study 
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Outline 
Sparse statistical data analysis based on 


the Li-norm - case study Background — communication channel characterization 
The basic idea - non parametric (kernel density estimator) 
Kernel selection — Slepian function 


Bandwitth selection — L, optimization 


Xiaoping A. Shen P 
eee Numerical examples. 


‘ttp://www.ohio.edu/people/shenx 
Department of Mathematics 
Ohio University 
Athens, Ohio 45701 


Sponsored 
OHIO AFOSR FA8650-08-D-6801 <= OHIO 


Basics of channel characterization 


— identify probability density function of signal envelope Estimation'of fading envelopes 


Fading Parametric estimation Fading envelope usually assumed to follow certain 
envelope probability distributions. Such fading models include: 


* Rayleigh Distribution 


rar(t) Piast 


yy 


fv) -Zrenf- é 
= 


| for0<r<o 
2c? 


0 forr<0 


+ Rician Distribution: 


ay= Zena - a a AN(4) for A4>0,r>0 
o 2 a 


a 


forr <0 


OHIO OHIO 


An example of no-Line-of-sight propagation communication 


The pdf of Rayleigh distribution and its spectral 
channel -- Manhattan 


Rayleigh 
fading 
channel 


OHIO =~ OHIO 
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A 4 4 A eeie etwork Centric \Warfare- Di | Battlefield Instrumentation 
An example of line of sight propagation communication 


channel -- Alaska * Technically NCW will present unseen communications and analysis 


challenges and complexities. 


Instrumentation must be equally as robust to allow the capture of the 
data at each node in the battlefield network in support of N analysis 
development, testir mprovement 


Infantry 
Fight = z -— “Robotic 
Vehicle ‘Sensor 


OHIO 


Challenge in NCW Data collection methods and processing Partial list of radios in the battle environment 


SINCGARS > 30-88 MHZ - 16 KBPS digital voice, data 75 BPS & 16 
— Data manually colleted to create final reports KBPS 


Have Quick Il (UHFAM/FM/PSK) > 225-400 MHZ - Analog & 16 KBPS 


— Limited Real-time feedback 

— How many nodes need to be monitored to have a large 
enough sample of the data? 

— Where do you monitor -TOC, Relays, Weapons (Radio, 
INC, LAN), UAV’s ? 


digital voice, data 75 BPS to 16 KBPS 


EPLRS > 420-450 MHZ. Data mode 57 KBPS VHSIC SIP and 228 KBPS 
VECP 


Wideband Networking Waveform > Expected 2 MHZ to 2 GHZ up to 5 
MBPS networked throughput 


LINK 16 > 960 to 1215 MHZ. - voice 2.4 and 16 KBPS, data 28.8 KBPS to 
1.137 MBPS 


LINK 114/TADIL > 2-30 MHZ and 225-400 MHZ - data 1364 and 2250 


BPS. VHF FM - - 30-88 MHZ - analog and digital voice at 16 KBPS 
VHF ATC Data Link > 2-30 MHZ. — analog, voice & data at 300, 600, 
1200 & 1800 BPS 


VHF/UHF FM LMR > 136-174 MHZ, 380-512MHZ, and 764-869 MHZ 


OHIO onppaleg & digital voice at 16 KBPS 


Partial list of radios in the battle environment Channel characterization in NCW 


Soldier Radio and WLAN > 1.755-1.850 GHZ. - digital 16 KBPS voice, 
data at 1 MBPS. 

MUOS > 240-320 MHZ - data at 2.4, 9.6, 16, 32 and 64 KBPS. 
Cellular Radio > 824-894 MHZ, 890-960 MHZ and 1850-1990 MHZ - 
voice and data at 10 KBPS nominal, 3G rates up to 144/384 KBPS and 
2 MBPS. 


Mobile Satellite Service (MSS) > 1.61-2 GHZ - digital voice at 2.4 to 
9.6 KBPS. 


* Relay on infrequently transmitted channel state information 
(CSI) feedback data 


* Historical data may not be available 
- for a specific location to be analyzed; 
- extensive relevant for all stochastic variables. 


Integrated Broadcast Service-Module (IBS-M) > 225-400 MHZ - data 
at 2.4, 4.8, 9.6 and 19.2 KBPS. 

VHF ATC Data Link (NEXCOM) > 118-137 MHZ - digital voice at 4.8 
KBPS, data at 31.5 KBPS. 

UHF AM/FM PSK > 225-400MHZ and 225-450 MHZ — analog, digital 
voice and data rates up to 16 KBPS. 

Link 4A > 225-400 MHZ. - data at 5 KBPS. 


The typical situation 


* available, relevant data are too sparse to provide a good basis 
for probability assessment 


* parametric estimation is not always valid. The rigid 
distributional assumptions constrain the data to fall in a 


Link 11B > 225-400 MHZ - data at 600, 1200, and 2400 BPS. given parametric family. 


DWTS > 1350-1850 MHZ — analog, digital voice and data at rates of 
144, 256, 288, 512, 576, 1024, 1152, 1544, 2048 and 2304 KBPS. 


OHIO 


* heavy noise 
OHIO 
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The idea — kernel density estimation 


* choose kernel with good sampling property and 
* use L, measure control bandwidth selection 


Kernel density estimation based on Slepian function 


Advantages: 
Higher convergence rate 
comparing to non wavelet density estimators; 
comparing to positive density estimators (such Gaussian) 


* Avoiding Gibbs phenomenon comparing to non positive 
density estimators. 


Remark: 
The kernel is positive definite kernel 
It is not positive kernel 


It is locally positive (this improve the convergence rate 
while positive interval can be chosen to meet a particular 
need.) 


OHIO 


The energy concentration problem 


Consider the function space B: (the Paley - Wiener space) 


QI. How large @*(f;7)can be ? 


Q2. Which © bandlimited function possesses the largest 
time concentration index? 

Mathematically, find the solution to the optimization 
problem: 


max a’ (rt; f) 
feB; 
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Examples of kernel density estimation 


* The naive estimator: 


a 00 1 N 
Fu = [an YE* B= LY An(Xi) 
i=l 


where 


q(x, ¥) = 4(y-|x1) 


* Orthogonal kernel estimator: Replace 7 by reproducing 
kernel of an orthogonal system (for example, father 
wavelet) 


Am (XY) = Sy o(2" x —n)p(2” y—n) 


The energy concentration problem 


(Slepian, Landau, and Pollak at Bell Lab, 1960s) 


¢ Define 


tL FOL at ©) Pm) ) di 
ar peal TOre and 6?(0;f)= bal F@)l do 
20 as 2 
Esa 6 Eas 
Time concentration index frequency concentration index 


0<a°(x), B(c)<1 


Solution of the concentration problem 
-- Slepian’s lucky accident 


* The energy concentration problem can be rewritten as the 
eigenvalue problem of an integral operator, 


S,-(P)(D) = AGO), 


where $, (AO = f° 6), c= or 
=o t—% 
* The integral operator is commute with a second order 
differential operator 


S,-P.; = Ps,8 


Ce cays iF OF 


where P, ,(@)(t):=(1-?°) cf -2 oi 
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Prolate functions {¢,,,.,(4)};,-0 are the solution to the 


concentration problem. 


Their associated eigenvalues are the corresponding time 


concentration index. 


Gee Cee eee Gee 
hee Pee 


\ 
By? th © ty Pao A, Sax 


The kernel and its associated Slepian function 
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Properties of PSWFs 


“Step function” property of the associated 
eigenvalues 0 <6, ¢ <1, there exits y, such that 
for ot >y,4,<é€ whenk22(1+d)or, A, >1l-eé 


when k < 2(1—d)or, 


Aor Aart 


Close to 1 close to 0 


Truncated Reproducing kernel 


The Slepian kernel, Bilateral kernel, and Gaussian kernel 
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Computational matters — the estimator is simply a 
Properties convolution of two sequences 


* The linear kernel density estimator is defined as 
1. The kernel X,,,(x,7) is positive definite convolution kernel i< 
ae atid p'mv(X)=—) Ky (x,X;), 
satisfying N 
where kel 


Ky (X50) =K,y(x-1,0) > 0 K(at)= 5 Salt = 02") 
= 7,07 


for |ise=t eae ; 
2 + Its truncated version can expressed as: 


where 


2. The density estimator is asymptotically unbiased DP! mN,Ky.Ks (X) =n iby 


N —m 
By Pacer e 


0,2" 2,7 


Ky 
Che ne Oi C=) C= 
PINGS meal ) Ei 21970," x,c(0) 


The eruption length (in minutes) of the Old Faithful geyser at 
Yellowstone National Park (top) and its histogram estimator 
(bottom). 


The wavelet kernel estimator (Db4) (top) and associated 
positive kernel estimator (bottom) 


0.835830, 


08 ha a 


+ yt 


0.946797, 


Density estimate for the Old Faithful geyser data by using the 
Slepian semi-wavelet kernel, window width 0.5 (left) and Kernel density estimation using Gaussian kernel 
window width 0.25 (right). 


ty function 
B 
Density function 


Densit 
= 


85 
Distribution A: Approved for public release; distribution unlimited. 
88ABW/PA Cleared 08/20/2013; 88ABW-2013-3715. 


3/15/2011 


References 


Density estimation for Rayleigh random sample (size=100) 
Slepian, D.; Pollak, H. O., (1961), Prolate spheroidal wave functions, Fourier analysis and uncertainty, 

I, The Bell System Technical Journal 40, 43-64. 

Landau, H. J. ; Pollak, H. O. (1961), Prolate spheroidal wave functions, Fourier analysis and 

uncertainty, II, The Bell System Technical Journal 40, 65-84. 

Landau, H. J. ; Pollak, H. O. (1962), Prolate spheroidal wave functions, Fourier analysis and 

uncertainty, III, The Bell System Technical Journal 41, 1295-1336. 

Slepian, D. (1964), Prolate spheroidal wave functions, Fourier analysis and uncertainty, IV, The Bell 

System Technical Journal 43, 3009-3058. 

Slepian, D, (1983), Some comments on Fourier analysis, uncertainty, and modeling, SIAM Review 


Courant, R.; Hilbert, D., (1953) Methods of mathematical physies, New York, Interscience Publishers, 
1953-62. 


Xiao, H; Rokhlin, V.; Yarvin, N. (2000), Prolate spheroidal waye functions, quadrature and 
interpolation, Inverse Problems 17 (2001), no. 4, 805--838. 

G. Beylkin and L. Monzon, On Generalized Gaussian Quadratures for Exponentials and their 
Applications, Preprint, Dec. 2000. 


Walter, G. G; Shen, X., (2001), Sampling with Prolate Spheroidal Functions, submitted to rhe Journal 
of Sampling Theory in Signal and Image Processing. 

Walter, G. G; Shen, X., (2002), Wavelets based on Prolate Spheroidal Functions, submitted to the 
Journal of Fourier analysis and applications. 

G. G. Walter and X. Shen. Wavelet-like behavior of Slepian functions and their use in density 
estimation. Communication in Statistics-Theory and Method, Vol. 34 (3), 687-711, 2005. 


X. Shen. Slepian semi wavelets and their use in density estimation for wireless signals, Journal of 
Mathematical Study, Vol. 40 (6), 117-131, 2007. 


Density function 


OHIO 


86 6 


Distribution A: Approved for public release; distribution unlimited. 
88ABW/PA Cleared 08/20/2013; 88ABW-2013-3715. 


APPENDIX 5 


Shannon Entropy and Relatives: A Brief Review and New Development 
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Shannon entropy and relatives: 
A brief review and new development 


In memory of Di. Daniel W. Repperiger 
Annie X. Shen!, Catherin R. Farris”, David J. Rieksts' and Paul R. Havig? 
!, Department of Mathematics, Ohio University, Athens, Ohio 45701 
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A search under keyword, wavelet entropy, at google.com on July 14 
returned about 1,370,000 results (0.09 seconds). The following is 
search results at citeseer 


Keywords Number of published papers (since 1948- 


Outlines 


Statistics - What happened in the past? 


Basic Mathematical Concepts — What is entropy? What 
is maximum entropy? Other relative concepts. 


The problems we are interested 
The problems with the problems 
A new conjecture 

Numerical experiments 


@ Renyi entropy 


Entropy 


34,406 


Shannon Entropy 


4423 


ApEn 


20,365 


Sample Entropy 


16,109 


Mutual Entropy 


6,472 


Entropy and 
Thermodynamics 


2,862 


Residue entropy 


3,440 


Renyi entropy 


573 


Fuzzy entropy 


1664 


Relative entropy 


29,863 


5,000 10,000 15,000 20,000 25,000 30,000 35,000 


O Fuzzy entropy 

B Entropy and 
Thermodynamics 

B Residue entropy 

@ Shannon Entropy 


0 Mutual Entropy 


O Sample Entropy 


Bi Relative entropy 
@ ApEn 


O Sample Entropy 


Citations 


0 Mutual Entropy 
@ Residue entropy 
B Entropy and 


Thermodynamics 
B Fuzzy entropy 


1960 1970 
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2000 


2010 


Basic Mathematical Concepts 


Definition 1. The entropy of a discrete random variable 
X is defined by 


A(X)=- > p(x) log p(x) 


2xEX 


Remark. (1) Entropy is always positive. (2) Entropy 
measures the uncertainty inherent in the distribution of a 
random variable. 


& OHIO 


Basic Mathematical Concepts - continue 


Definition 3. The conditional entropy H(X|Y) is defined 
as 


H(X|Y) =—- S$ p(@)H(Y|X = 2) 
re N 


Remark. The conditional entropy of Y conditional on X 
refers to the average entropy of Y conditional on the 
value of X, averaged over all possible values of X. 


S&S OHIO 


Remarks on relative entropy 


. The relative entropy is a “measure” of something like 
the distance between two different probability 
distributions! 


. Comparing two probability distributions using relative 
entropy. 


. The relative entropy is not symmetric! 


@& OHIO 


Basic Mathematical Concepts - continue 


Definition 2 The joint entropy H(X,Y) of a pair of 
discrete random variables with a joint distribution p(x,y) 
is defined as 


A(X. Y)=- x >> ple. y) log p(z. y) 


2X yoY 


The chain rule for joint entropy 


HA(X|Y) = A(X) + A(Y|X) 


& OHIO 


Basic Mathematical Concepts - continue 


Definition 4. The relative entropy between two probability 
distributions is given by 


D(p\\q) = J p(x) log“ 


p(x) 
#ox a #) 


S& OHIO 


Basic Mathematical Concepts - continue 


Definition 5. The mutual information I(X,Y) measures how 
much (on average) the realization of random variable Y 
tells us about the realization of X, i. e., how by how much 
entropy of X is reduced if we know the realization of Y. 


I(X;Y) = A(X) — A(X|Y) 


Remark. Mutual information is of the most important 
information concepts for biology. For example, 


& OHIO 
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Example 


The mutual information between a cue and the 
environment indicates us how much on average the cue 
tells us about the environment. The mutual information 
between a spike train and a sensor input tells us how much 
the spike train tells us abut the sensory input. If the cue is 
perfectly informative - if it tells us everything between cue 
and environment if simply the entropy o the environment: 


I(X;Y) = A(X) — A(X|Y) = A(X) — A(X|X) = A(X) 


Remarks. (1) The mutual information between a random 
variable and itself is simply its entropy. 


(2) Mutual information is symmetric; X tells exactly as much 
about Y as Y tells us about X. 


& OHIO 


Progress of the project 


* Current status of the project — solving problem 1. 
* Conjecture and theme. 


We believe there is not a single parameter, for example, 
the following parameters are the most popular ones: 


(1) Lyapunov exponent 

(2) entropy — uncertainty 

(3) Hurst — long range memory (global property) 
(4) Holder- continuity (local property) 


Our goal it to build a multiparametric model 


S&S OHIO 


About Hurst exponent and fractional Brownian motion 


Hurst exponent H is named the Hurst parameter by 
Mandelbrot in honor of both Harold Edwin Hurst and 
Ludwig Otto Holder [3] 


the fractional Brownian motion (fBm) has become widely 
popular in a theoretical context as 


compute Hurst exponents. The numerical experiments in 
this project are based on the built in 


programs in Matlab 
Fractional Brownian motion 


& OHIO 


The problems 


Problem 1. Find an effective model of monitoring network 
volatility and uncertainty using the concept of entropy. 


Problem 2. Quantify the volatility and uncertainty. Namely, 
numerically and timely detect and identify network volatility 
and uncertainty. 


Problem 3. Build a multiscale self-adaptive model for 
forecasting based available data. 


& OHIO 


Toward the goal of multiparametric model 


Discover relations among parameters - Numerical 
examples to illustrate the relations between approximate 
entropy (ApEn) and parameter Hurst. 


Before we show the numerical results, we recall the 
¢ definition for Hurst exponent or parameter and 

* the concept ApEn; and 

« Why we use ApEn (not entropy) 


S& OHIO 


Approximate Entropy (ApEn) 


Suppose there is a sequence Sy = (S(1), S(2),...S(N)). The 
variables m used to represent the length of the 
subsequence (patterns) to be compared. Let p,,(i) denote 
subsequence starting at S(i) of length m. Further, the set of 
all subsequences of length m is P,, = { p,,(1), Pm(2); Pm(3); 
oo 9 (a((Nl—= lan => 1) 
(1) r - measure the similarity of two subsequences. In 
particular, the subsequences pm(i) and pm(j) are similar if, | 
S(i + k) - SG. +k)| <1r,500<sk<m 
(2) C,,,(1) = n,,,(r)/ (N — m + 1) can be thought of as the 
percentage of patterns in P,, that are similar to p,,(i) 


Define: ApEn(SN, m, r) = In(Cm(r)/ Cm + 1(r)). 


& OHIO 
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Approximate entropy (ApEn) Numerical simulations 


A technique that was developed for determining the Notes: 
predictability of a sequence, such as a time series. 1. Fractional Brownian motion is simulated using matlab 


2. A matlab program (modified program from matlabl 
ApEn central) used to compute the ApEn 


Randomness 


The more the repeated patterns the time series has, the 
lower its ApEn would be. 


The less the patterns the time series has, the larger the 
ApEn would be. 


S&S OHIO & OHIO 


Pseudo-code for ApEn . 
Relation between Shannon entropy and Hurst exponent 


(A matlab program written by Avinash Parnandi (Integrate Monte Carlo and bootstrap technique) 
and is available at MatlabCentral is based on 

these codes) 

INPUT 

n - the length of the pattern (template) 

r - the similarity measure (matching tolerance), 

a- the input data vector. 

OUTPUT  apen is the approximate entropy of the 

sequence given the input parameters. 

continue 


& OHIO S& OHIO 


Relation of the parameter r of ApEn and Hurst 
Summary and future work 

parameter 

Summary. In the talk, we presented some initial study to 

reveal relations among the most important parameters 

(illustreated by using ApEn and Hurst). These efforts are 

made to solve the problems 1. 

Future work — Quantify the volatility and uncertainty. 

Namely, numerically and timely detect and identify network 

volatility and uncertainty. 

High goal. Build a multiscale self-adaptive model for 

monitoring and forecasting based available data. 


Approximate Entropy 


& OHIO & OHIO 
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APPENDIX E 


Digital Signal Representation with Slepian Series 
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Outline 


Digital signal representation with 


Slepian series Introduction to Energy concentration problem and its 
solution 
-Slepian functions 
An hierarchical system system 
Xiaoping A. Shen Properties of the system. 
Numerical examples. 


http://www.ohio.edu/people/shenx 
Department of Mathematics References 
Ohio University 


Athens, Ohio 45701 


Sponsored by ONR and 
AFOSR FA8650-08-D-6801 


F Examples - Gaussian kernel and bilateral kernel 
The Paley-Wiener theorem 


Assume that / € [?(R) is analytic. If there are positive 


constants K and o so that for all z eC, 
| A(z) |< Kexp(o|z)), o) 
then the Fourier transform of h, heP (-o,o) and 
hz)=[" A(x)exp(—2mixz)dx — (***) 
A function satisfies (*) is said to be exponential type. 
A function satisfies (**) is said to be o — bandlimited 
B; = collection of all analytic function of exponential 


type (or o bandlimited functions) in L* 


OHIO OHIO 


The sinc function Shannon Sampling Theorem 


Iff is Z bandlimited (analytic of exponential type), then the 
formula: 
sinz(x—k) 


f= Vo f® 


m(x—k) 


is exact. 
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The restriction of the Heisenberg uncertainty principle 


(Harmonic analysis) A non-trivial function can not have time 
limiting and frequency limiting properties simultaneously. 


2ot — Theory 


When oz is large, the space of signals approximately, 


of duration z and bandwidth o 


has dimension 207 |approximately! 


The energy concentration problem 
Consider the function space B: (the Paley - Wiener space) 
Ql. How large q?(f;r) can be ? 


Q2. Which o¢ bandlimited function possesses the largest time 
concentration index? 


Mathematically, find the solution to the optimization problem: 


max a’ (rt; f) 
feB; 


OHIO 


Prolate functions {¢,,,  (¢)}}- are the solution to the 
concentration problem. 


Their associated eigenvalues are the corresponding time 
concentration index. 


Gee Ob cer soe Oh erate 


ee le 
Ay 2 oy 2 ay Sond A, Sa SU 
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The energy concentration problem (Slepian, Landau, 


Pollak at Bell Lab, 1960s) 


Define 


tT) £2 ates 2 
di 2 
Pe p- eS at Pipa eee 
all ONE ae [2.1 f(@) |? do 
Time concentration index 


0<a°(z), f(c)<1 


frequency concentration index 


Solution of the concentration problem 
-- Slepian’s lucky accident 


The energy concentration problem can be rewritten as the 
eigenvalue problem of an integral operator, 
So..(PO = 490), 
where S,.(A)(0) =f (x) 9 a, c=ot 
= aes 

The integral operator is commute with a second order 
differential operator 

So Po. =P, 


a 
dt? dt 


See 
i 


where P, .(9)(t) = (1-77) yh er 


The solution of the concentration problem 


Mote specifically, 

%o¢(t) has maximum time concentration index, its 
associated eigenvalue is the possible largest concentration 
index, that is 

: Die 
Aoar = UE Fo,0,2) = mata’ (7, f) 

%;..(t) is the function with maximum time concentration 
index among all the functions that orthogonal to ¢,_,,.(0) 
with its associated eigenvalue 


Aor =a(t; Por) 
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Prolate (Slepian) functions Ce A Fourier pair of pi bandlimited PSWF 


LPOTDSREES OULHSNWARE Properties of PSWFs -continue 


3. Duality of Fourier Transform. 
1. Double orthogonality 


” 2at 


e Pn.,2, (@) = 
Fi rece, Ot.o.2, (dt = 5(n- k) Anas 


Fi Grere, Or one, dt = 2,6(n- k) 


Orae(—)ho(@) 
oOo 


2. Discrete orthogonality 


Ft) =f D> enlb) F(R) }en(0)- 


n=0 k=—oo 


A Fourier pair of pi bandlimited 
Slepian function 


3. {¢,0:} is an orthogonal basis of B;, and L*[-r,r] 


OHIO 


Properties of PSWFs - continue Step function property 


4. “Step function” property of the associated eigenvalues 


For each 0 <0, ¢ <1, there exits v, such that 
for ot >y, 4, <é whenk22(1+d)or, A, >1-e 
when k < 2(1—d)or, 


Ay By axe Arar Aor xi 


Close to 1 i close to 0 
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Citation Slepian’s paper 


Yeor of Publigetion of Cited Ar 


OHIO 


The problem we wish to solve: 


Assume a piecewise band-limited signal fis given by its equally spaced samples 


1 
fos ae ; ; 
The data may be damaged by uniformly distributed noise. That is, what we 
have is 


ie ra aM 
where isfy tpidated yandom error. 

We wish to develop a robotic algorithm to represent the given signal in an 
effective gay= One 


OHIO 


The hierarchical system 


Definition (The hierarchical basis on [-1,1]) Consider the decomposition 


[-1=U) 1+ at atte h ices me 


We define 
(Iy"*2" 6, (2"x—2k-+ 2" #1); 


bat )=4 py 


-1+ a <x<-ltshm SO i= Ole =P, con 


m : the level index 

n:the label of nth eigenfunct ions 

k : the piecewise index 

c: the bandwidth of the original Slepian functions ¢,, , ,(t) 


Pro X50) = Pp, o COX) 


OHIO 
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Motivation 


Recall spectral approximations 
For periodic problems: Fourier approximation 


For non-periodic problems: Chebyshev, Legendre 
approximation 


For analytic functions: Both approximation assume 
exponential accuracy 


For piecewise analytic functions, 


O(N) globally 
O(1) at near discontinuity — known as Gibbs phenomenon 


OHIO 


Piecewise band-limited functions 


Definition (Piecewise band-limited function on [a, b]) 


Let f </*[a,5] If there exists a partition [a,6]=U'[a,.4,..] 
such that 
FOX) = Fi), ¥€[4).4)1),4 = 0,....7-1 


where the analytical extension F,, associated with /,, belong 


to a Paley-Wiener space B?,i=0,....2-1 


Remark Notice that, we have 


f, B20 = max {o,} 


Osisn-1 


OHIO 


Examples of the system 


OHIO 
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Another example Structure of the binary system 


OHIO OHIO 


The hierarchical system Raised cosine function 


The function defined as 


_ ste — fe + apecnateG + AL 


(eal k 


Gua (HEX EL -14 sat as = Tyng™ 


ge 


®,,,0%5¢) = 


is a piecewise band-limited function with each piece has same approximately 
bandwidth am 
€ 


Denote 


V,, = spant@,,,,(x5¢),/ =0,1,2,...,, k = 1,2,...,2” n} 
The space consists of piecewise band-limited functions with dimension 
a 


2"n 


OHIO =e OHIO 


Example. A signal with Gaussion noise The problem with discontinuity 


Raised Cosine and its SPWF expansion, c=10, k=4 


Thus, in approximating a piece of a non analytic function by 
band-limited functions, we exchange goodness of fit in 
analytic intervals with wildness of behavior outside this 
interval 


600 1000 1500 2000 2500 3000 3500 40 


— Slepian 


Expansion of Raised Consine with added Gaussian noise 


¥ wine 


fi 
iil 
i 
er" 


500 1000 1600 2000 2500 3000 3600 4000 4500 


Hu 
"iy 
Ah m 
ht 
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Orthogonal expansion using Slepian functions Orthogonal expansion using Slepian functions 
— A piecewise band-limited function — Heaviside function 


+ + 
ae 
¥ 


N10 4 
e4s6.c08.N ; PSWF coatfiients 


ht ha 
Pettey, 

+ * 

+ 

ee 

orterianets PSWF coaffilents 


aT) 


OHIO OHIO 


Orthogonal expansion using Slepian functions 


— Gaussian kernel «=! Concluding remarks 


ee 
In this research, we are looking for a better representation (or model) for 


discretized piecewise band-limited functions with attempting to recover 
this type of signal via their spectral expansion coefficients effectively. 


A hierarchical system based on Slepian functions are introduced. 


The system has some peculiar properties which are useful in signal 
analysis for a particular type of signals. 


The system shows promising potential in certain applications, such as 
noised signal representation, segmentation, detection, etc. 


OHIO OHIO 
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Complexity Analysis and Information Visualization 
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Complexity Analysis and Information Eyerviow 


Visualization 
Air Force Research Lab 
March 5, 2010 Complex Networks Overview 
Information Visualization Overview 
Air Force Research Lab Where we are now. Where we are headed. 
Wright-Patterson AFB, Ohio Faculty and Internship Opportunities 
Summary 
Paul R. Havig, PhD, Questions 
and Kate A. Farris 
711% Human Performance Wing 
Air Force Research Laboratory 


NN , 
A Air Force Research Laboratory 


*Technology Directorates 
+ Air Vehicles 
+ Directed Energy 
+ 711 Human Performance Wing 
+ Information 


+ Materials and Manufacturing . 6 Locations 


+ Munitions + Washington DC 


+ Mesa, AZ 

+ San Antonio, TX 
« Rome, NY 

+ Dayton, OH 

+ Albuquerque, NM 


+ Propulsion 
+ Sensors 
+ Space Vehicles 


+ Air Force Office of Scientific Research 


NN 


, r 
+d Complex Networks Ar: Objective 


* “More studies that assess the effectiveness of such * Objective: To develop a concrete, empirically validated 
[network] approximations — and provide concrete, “measuring stick” that can better asses the performance 
empirically validated guidelines for practice within to vulnerability tradeoff of a computer network. 
particular problem domains — would be a welcome 
addition to the literature.” : 

+ Four fundamental assumptions: 


AN J 1. Assess what it is that is interacting 
— Carter T. Butts, Revisiting the Foundations of Network Analysis, 


Science, Vol 325, 24 July 2009 . Qualify the nature of the interaction 
Quantify the time scale on which that interaction takes place 


Is the network static or dynamic? 
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a 3 ; : r X J : : 
4 Underlying Assumptions Ar: Td Fusing Three Fields 


Exactly what is interacting? + Characterize network performance and vulnerability 
Nodes and Links through information, optimization and graph theories. 


Node = Webpage As 


s 
s 


Link = Transmission Between Websites 
Information 


(accomplished via wires, cables and wireless access) Theory 


. Nature of interaction? 
a. Research, E-mail, Blogging, etc. 

. Time scale which interaction occurs? i ~-e a7 
a. Milliseconds ~ "Optimization i 

. Static or Dynamic? Theory 


a. Dynamic Network, constantly changing * 


Combining Three Major Disciplines 


\7 Graph Theory 


+ Network (graph) is a set of nodes connected by edges * Basic Graph Theory Concepts 
* Nodes: Vertices D, I, S and T * Digraph (Directed Graph) 
* Edges: ID, DS, SD, IT, SI, ST and TI 


ae, 


(12) 
8 
1 & 


\ 


tnsinte 


cutset 


‘Simple Combat Network 
Network with Cut Set and Flow/Capacities Measures 


J. Cares, Distributed Network Operations: The Foundations of i . | 
Network Centric Warfare Ye Visual rendering designed by D. W. Repperger, 1942-2010 


& r 
YZ Information Theory ’r- Mutual Information 


* Information Channel: 


+ Variables: 
H(«/ y) = Lost 1. H(x) - input uncertainty 


Information = 


Mutual Information = (xy) is the reduction in uncertainty ot 
Equivocation 


H(y) - output Ba == an input Sbject By observing-an output object 
uncertainty : [ 


Input Tox) = Transmitted output 
Information Information Information }.  H(x/y) - equivocation 


Hog = Hy) = 


lost to environment 


. _ H(y/x) — spurious 
Hiyhx) = Noise or uncertainty 

Spurious Information 

Kx;y) - mutual 
information 


Basic Elements of the Information Channel transmission 


T. B. Sheridan and W. R. Ferrell, Man-Machine Systems: Information, Distinguishing Mutual Information (x;y) from the Uncertainty Variables H(.) 
Control, and Decision Models of Human Performance, The MIT press, 
Cambridge, Massachusetts, 1981. 
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Final Picture 


* Converting a Complex Network into a framework that is 
possible to use information-theoretic methods to 
calculate mean flows that occur within an ellipse. 


to te network 
-_ 


Framework for Combining Graph Theory with information Theory 


Visual rendering designed by D. W. Repperger, 1942-2010 


Introduction 


* Main points 
— 3D (metrics) 
— Interaction (tangible user interfaces) 
— Information Visualization (new directions) 


3D Metrics 


‘Al 3 5 
Standard Better No difference New display superior 


Using the scale above rate the new display for the following questions: 


Which display looks sharper? 

Which display has brighter colors? 

Which display is more comfortable to view? 
Which display do you prefer? 


2/19/2012 


\ Where we are now. Where we’re 
y¥ headed. 


¢ Where we are now: 


— Objective: To develop a concrete, empirically validated 
“measuring stick” that can better asses the performance to 
vulnerability tradeoff of a computer network. 


— Theoretical framework has been developed 


« Where we are headed: 
— Run JAVA-based computer simulations 
— Compare results of simulations against theoretical framework 
— Re-test simulations if necessary 


3D Metrics 


+ Why metrics? 
— Not apples to oranges 


+ A taxonomy of metrics 
— Subjective 
— Subjective-Objective (Performance) 
— Objective 


3D Metrics 


+ Performance 
— Why buy these expensive “toys”? 
— When to use? (Should we use?) 
+ Issues (e.g., health) 
+ Why not just use a PS3? 
— Is 2.5D good enough? 
— Can we make a style guide? 
+ AFOSR research 
+ Killer apps? 
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3D Metrics F \ Se 3D Metrics 
~ 


OIE 1976 


* Objective metrics 
— System 
+ Find it on a spec sheet 
— Measurable 
+ Find/verify in the laboratory 


a Perspecta Lights On 
|—m— Perspecta Lights Off 


3D M i 6 .s) 
etrics hd 


CIE 1976 


Tangible User Interfaces 


* Ok for a 2D environment (Word, Excel, etc.) 
* Not for a 3D (nor 2.5D) (Gaming, CAD, etc.) 
— 2D #3D! 
* Tangible User Interfaces (TUls) 
— 6 degrees of freedom (DOF) 
— Spatial versus desktop TUls 
+ Is the keyboard and mouse that bad? 
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Md Conclusions and Future Directions 


+ What we have 
— 3D does help 
+ Where is the killer app? 
— TUls are needed 
+ But...keyboards are quite helpful! 


* Where do we want to go now 
— Deep dives and blue skies! 
— Laboratory upgrade 


Key Players 
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Keyboard and Mouse 


Years of training with mouse to. 
control applications. 

For 2D tasks it is great. 

Little to no additional training 
required. 


For 3D tasks involving 6D motion, 
additional keyboard keys needed. 
Requires two hands. 

Basic mouse is only 2 DOF. 

May not be intuitive to new users. 


Sandio Tech 3D Mouse 


Inherits the things mice are good 
for. 

The extra controls take only 5-10 
minutes to understand. 

6 DOF with one hand. 


It was easy to rotate or translate 
the reverse directions, so some 
confusability between directions. 
Unintuitive. 

Cannot vary speed of motion. 


3D Connections 


Only a single knob to manipulate. 
Once learned, is nice for quick 
zooming in/out or moving viewpoint 
from one location to another. 

6 DOF with one hand. 


Troubles with cross talk between 
pushing left / right and tilting left / 
right. 

Getting “lost” or dis-oriented, 

Not entirely intuitive. 

Co-axial rotation and translation 
can bleed without a proper dead- 
zone and careful device 
manipulation. 


In-House TUI 


Very intuitive mapping 
Rotation still a problem; dificult to 
return to “home”; might want a 
“standard axis” around which to 
rotate that can be toggled on/off 
(.e., North/South) 


Tangible, real-world motion capture |° 


Only partially mapped inputs, so 
cannot control everything like the 
other input devices. 

Utilizes a magnetic field to function. 
Device is sensitive to fields created 
by ferrous metals and other nag 
electronic devices. 

Short range - Tracker must 

within 1 meter of emitter to get 
accurate readings. 


Relationship Between Fields 


Information 
Visualization 


Faculty and Internship 


INFORMATION 
VISUALIZATION 


PER- 
CEPTION, 
DESIGN & 
VISUAL 
COMMUNI- 
CATION 


KNOWLEDGE 
VISUALIZATION 
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Opportunities 
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88ABW/PA Cleared 08/20/2013; 88ABW-2013-3715. 


« Summer Faculty Fellowship Program (SFFP) 
« Student Temporary Employment Program (STEP) 


2/19/2012 


Summary d ®S : Questions??? 


* Basic research in complex network dynamics 


* Building research in information visualization 


* Connecting math to pictures * Contact Info: Paul.Havig@wpaftb.af.mil 


Katheryn.Farris@wpafb.af.mil 
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APPENDIX G 


Visualizing Complexity of Sunspot Cycles Via Wavelet Transform 
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In memowy of Dt. Daniel UW) Reppewmer 


Abstract. Dark spots, some as large as 50,000 miles in diameter, move across the surface of the sun, contracting and expanding as they go. These strange and powerful phenomena 
are known as sunspots [1] (See Figure 1). Sunspots generate magnetic regions on the Sun with strengths thousands of times stronger than the Earth's magnetic field, which can 
physically damage satellites and pose a threat to astronauts. This study employed wavelet analysis, a powerful mathematical tool, to reveal and visualize the fractal patterns of sunspot 
data. The long term goal of this research is to discover the underlying mathematical theory behind the data analysis and visualization associated with human-machine 
integrated complex networks to improve human effectiveness and performance capability via effective information visualization. 


1. Background. 


Wavelet analysis is a new 
mathematical tool which 
can be used to extract 


Information from _ data, Figure 1. Sunspots and the region around sunspots 
such as audio signals and (Courtesy National Solar Observatory). 


images. AS opposed to Fourier bases, wavelet bases are localized 
with multiscale. The fast Wavelet Transform (FWT) is faster than 
the fast Fourier transform (FFT). Wavelet filters can be finite impulse 
response (FIR) or infinite impulse response (IIR). Mathematically, 
just like a tuning fork physically resonates with a signal of its specific 
tuning frequency, the wavelet filters will resonate if the unknown 


signal contains information of similar Trequency see 2 and the 
references there In). . , 


Figure 2 shows three 
different types of wavelet 
families — the first two are 
FIR and the last one Is IIR. 


2. Initial Study. Figure 2. — From left: father and mother of Coiflets 
(coif3), Daubechies (db4), and the Meyer wavelet 
2.1 Data description. The International Sunspot Number (ISSN) Is 
compiled by the Solar Influences Data Analysis Center in Belgium. 
NOAA sunspot number is compiled by the US National Oceanic and 
Atmospheric Administration. The data are the monthly ISSN averages 
and standard deviation derived from ISSN from January 1749 to June 
2010 (a total of 3138 months). 
2.2 Basic statistical properties. Basic statistical properties of the 
data are shown in Figure 3. The descriptive statistics show that the 
average sunspots number is 51.9733 per month with standard 
deviation (std) 44.3406. The largest sunspot number, 253, occurred 


a 3. “Wenttity average sunspot numbers. 1749 — 2010 (left). Frequency 
histogram (middle) and normality test. 
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In October 1957. The 75th percentile of the sunspot number is 76.6. Recently, from 2000-2010, the mean Is 
50.4153 with std=46.0629. Since 2000, the largest sunspot number, 170, occurred in July 2000. As the 
histogram (middle panel) suggested, the normality test indicates (right panel) that the data does not follow any 
normal distribution. The statistical findings and observations lead to the following 


Conjectures. 1. Sunspot number is a random variable following a long tailed - 
distribution. 2. Sunspot number has fractal structure with self similarity. 


A parametric test Indicates that the Sunspot number possesses the Hurst =: 
number 0.27 approximately (below normal human Heartbeat interval EEG ° 
data). We then apply wavelet decomposition (as multiple lenses using ° a ala 
Coiflets) to see the signal at different resolutions and better understand the data _—“ Figure 4. Four level wavelet 
structure (results shown in Figure 4). Motivated by the initial study, the following ¢ecomposttion. 

method Is designed to provide visual evidences for the conjectures. 


3. The Method. Step 1. Develop reference signals. The 
selected reference signals are, the famous fractal curve - von 
Koch curve (Hurst exponent H=1.7381 and the _ fractional 
dimension D=1.2619) and an electric consumption signal which 
does not have fractal structure (see Figure 5). Step 2. Perform 
continuous wavelet transform (CWT) with the 1-D graphical tool 
in Matlab Wavelet Toolbox®. The wavelet coefficients of the Koch 
curve show a repeating pattern on many scales (see figure 4, the 
vertical axis is scale measure) while the coefficients of the other & : 
Signal does not. Step 3. Perform CWT of sunspots as described Figure 5 CWT of Koch curve (top) and an electrical 
for the reference signals. Results are shown in Figure 6, from left 3000.s000) using Coiflats (resolution level 8) 
top corner, the CWT based on Daubechies wavelet (N=8), the 

CWT based on Coelflets and zoom in pictures. The graphical 
results suggest that the conjecture holds for the current sunspot 
data. It is worth mentioning that Wavelet decomposition Is very 
well adapted to the study of the fractal properties of signals and 
images. In particular, when the characteristics of a fractal evolve 
with time and become local (multifractal), wavelets are especially 
Suitable tool for practical analysis and generation. Readers are 
referred to reference [3] for more details of the study. 


References Figure 6. CWT of sunspot data. 


[1] National Aeronautics and Space Administration, http://solarscience.msfc.nasa.gov, date visited 08/04/2010. 

[2] G. G. Walter and X.Shen. Wavelets and Other Orthogonal Systems, 2"9 Ed., Studies in Advanced Mathematics, xx+370 pp, Chapman & Hall/CRC, Boca Raton, FL, 2001. ISBN: 1- 
5848-8227-1. 

[3] X. Shen. Visualizing information of engineered complex systems, working paper, 2010. 


Distribution A: Approved for public release; distribution unlimited. 


Sponsored by AFOSR Grant #FA8650-08-D-6801 


